Chapter 1

Syllabus 



1.1 Letter to Student 1 

To the Student: 

This course and this Student Manual reflect a collective effort by your instructor, the Vietnam Education 
Foundation, the Vietnam Open Courseware (VOCW) Project and faculty colleagues within Vietnam and 
the United States who served as reviewers of drafts of this Student Manual. This course is an important 
component of our academic program. Although it has been offered for many years, this latest version 
represents an attempt to expand the range of sources of information and instruction so that the course 
continues to be up-to-date and the methods well suited to what is to be learned. 

This Student Manual is designed to assist you through the course by providing specific information about 
student responsibilities including requirements, timelines and evaluations. 

You will be asked from time to time to offer feedback on how the Student Manual is working and how
the course is progressing. Your comments will inform the development team about what is working and 
what requires attention. Our goal is to help you learn what is important about this particular field and to 
eventually succeed as a professional applying what you learn in this course. 

Thank you for your cooperation. 

Tuan Do-Hong

1.2 Contact Information 2 

Faculty Information: Department of Telecommunications Engineering, Faculty of Electrical and Electronics
Engineering, Ho Chi Minh City University of Technology

Instructor: Dr.-Ing. Tuan Do-Hong 

Office Location: Ground floor, B3 Building 

Phone: +84 (0) 8 8654184 

Email: do-hong@hcmut.edu.vn

Office Hours: 9:00 am - 5:00 pm 

Assistants: 

Office Location: Ground floor, B3 Building 

Phone: +84 (0) 8 8654184 

Email: 

Office Hours: 9:00 am - 5:00 pm 

Lab sections/support: 



1 This content is available online at <http://cnx.org/content/m15429/1.1/>.
2 This content is available online at <http://cnx.org/content/m15431/1.3/>.



2 CHAPTER 1. SYLLABUS 

1.3 Resources 3 

Connexions: http://cnx.org/ 4 

MIT's OpenCourseWare: http://ocw.mit.edu/index.html 5 

Computer resource: Matlab and Simulink 

Textbook(s): 

Required: 

[1] Bernard Sklar, Digital Communications: Fundamentals and Applications, 2nd edition, 2001, 
Prentice Hall. 

Recommended: 

[2] John Proakis, Digital Communications, 4th edition, 2001, McGraw-Hill. 

[3] Bruce Carlson et al., Communication Systems: An Introduction to Signals and Noise in 
Electrical Communication, 4th edition, 2001, McGraw-Hill. 

[4] Rodger E. Ziemer, Roger L. Peterson, Introduction to Digital Communication, 2nd edition,
2000, Prentice Hall.

1.4 Purpose of the Course 6 

Title: Principles of Digital Communications 

Credits: 3 (4 hours/week, 15 weeks/semester) 

Course Rationale: 

Wireless communication is fundamentally the art of communicating information without wires. In prin- 
ciple, wireless communication encompasses any number of techniques including underwater acoustic com- 
munication, radio communication, and satellite communication, among others. The term was coined in the 
early days of radio, fell out of fashion for about fifty years, and was rediscovered during the cellular telephony 
revolution. Wireless now implies communication using electromagnetic waves - placing it within the domain 
of electrical engineering. Wireless communication techniques can be classified as either analog or digital. The 
first commercial systems were analog including AM radio, FM radio, television, and first generation cellular 
systems. Analog communication is gradually being replaced with digital communication. The fundamental 
difference between the two is that in digital communication, the source is assumed to be digital. Every major 
wireless system being developed and deployed is built around digital communication including cellular com- 
munication, wireless local area networking, personal area networking, and high-definition television. Thus 
this course will focus on digital wireless communication. 

This course is a required core course in communications engineering that introduces the principles of digital
communications while reinforcing concepts learned in analog communication systems. It is intended to
provide comprehensive coverage of digital communication systems for final-year undergraduate students,
first-year graduate students, and practicing engineers.

Pre-requisites: Communication Systems. Thorough knowledge of Signals and Systems, Linear 
Algebra, Digital Signal Processing, and Probability Theory and Stochastic Processes is essential. 

1.5 Course Description 7 

This course explores elements of the theory and practice of digital communications. The course will 1)
model and study the effects of channel impairments such as distortion, noise, interference, and fading on the
performance of communication systems; 2) introduce signal processing, modulation, and coding techniques
that are used in digital communication systems. The concepts and tools acquired in this course are:



3 This content is available online at <http://cnx.org/content/m15438/1.3/>.

4 http://cnx.org/

5 http://ocw.mit.edu/index.html

6 This content is available online at <http://cnx.org/content/m15433/1.2/>.

7 This content is available online at <http://cnx.org/content/m15435/1.4/>.



Signals and Systems

• Classification of signals and systems
• Orthogonal functions, Fourier series, Fourier transform
• Spectra and filtering
• Sampling theory, Nyquist theorem
• Random processes, autocorrelation, power spectrum
• Systems with random input/output

Source Coding

• Elements of compression, Huffman coding
• Elements of quantization theory
• Pulse code modulation (PCM) and variations
• Rate/bandwidth calculations in communication systems

Communication over AWGN Channels

• Signals and noise, Eb/N0
• Receiver structure, demodulation and detection
• Correlation receiver and matched filter
• Detection of binary signals in AWGN
• Optimal detection for general modulation
• Coherent and non-coherent detection

Communication over Band-limited AWGN Channel

• ISI in band-limited channels
• Zero-ISI condition: the Nyquist criterion
• Raised cosine filters
• Partial response signals
• Equalization using zero-forcing criterion

Channel Coding

• Types of error control
• Block codes
• Error detection and correction
• Convolutional codes and the Viterbi algorithm

Communication over Fading Channel

• Fading channels
• Characterizing mobile-radio propagation
• Signal time-spreading
• Mitigating the effects of fading
• Application of Viterbi equalizer in GSM system
• Application of Rake receiver in CDMA system

1.6 Calendar 8 

Week 1: Overview of signals and spectra 
Week 2: Source coding 

Week 3: Receiver structure, demodulation and detection 

Week 4: Correlation receiver and matched filter. Detection of binary signals in AWGN 
Week 5: Optimal detection for general modulation. Coherent and non-coherent detection (I) 
Week 6: Coherent and non-coherent detection (II) 

Week 7: ISI in band-limited channels. Zero-ISI condition: the Nyquist criterion 
Week 8: Mid-term exam 
Week 9: Raised cosine filters. Partial response signals 



8 This content is available online at <http://cnx.org/content/m15448/1.3/>.




Week 10: Channel equalization 

Week 11: Channel coding. Block codes 

Week 12: Convolutional codes 

Week 13: Viterbi algorithm 

Week 14: Fading channel. Characterizing mobile-radio propagation 

Week 15: Mitigating the effects of fading 

Week 16: Applications of Viterbi equalizer and Rake receiver in GSM and CDMA systems 

Week 17: Final exam 

1.7 Grading Procedures 9 

Homework/Participation/Exams: 

• Homework and Programming Assignments 

• Midterm Exam 

• Final Exam 

Homework and programming assignments will be given to test students' knowledge and understanding of the
covered topics. They will be assigned frequently throughout the course and will be due at the time and
place indicated on the assignment. Each student must complete homework and programming assignments
individually, without collaboration with others. No late homework will be accepted.

There will be in-class mid-term and final exams. The mid-term exam and the final exam will be
time-limited to 60 minutes and 120 minutes, respectively. They will be closed book and closed notes. It is
recommended that students practice working problems from the book, example problems, and homework
problems.

Participation: Questions and discussion in class are encouraged. Participation will be noted.

Grades for this course will be based on the following weighting: 

• Homework and In-class Participation: 20% 

• Programming Assignments: 20% 

• Mid-term Exam: 20% 

• Final Exam: 40% 



9 This content is available online at <http://cnx.org/content/m15437/1.2/>.



Chapter 2 

Chapter 1: Signals and Systems 



2.1 Signal Classifications and Properties 1 

2.1.1 Introduction 

This module will begin our study of signals and systems by laying out some of the fundamentals of signal clas- 
sification. It is essentially an introduction to the important definitions and properties that are fundamental 
to the discussion of signals and systems, with a brief discussion of each. 

2.1.2 Classifications of Signals 

2.1.2.1 Continuous-Time vs. Discrete-Time

As the names suggest, this classification is determined by whether or not the time axis is discrete (countable) 
or continuous (Figure 2.1). A continuous-time signal will contain a value for all real numbers along the 
time axis. In contrast to this, a discrete-time signal 2 , often created by sampling a continuous signal, will 
only have values at equally spaced intervals along the time axis. 



Figure 2.1 (figure omitted: a signal plotted against a time axis that is either continuous or discrete)



1 This content is available online at <http://cnx.org/content/m10057/2.21/>.
2 "Discrete-Time Signals" <http://cnx.org/content/m0009/latest/> 
2.1.2.2 Analog vs. Digital



The difference between analog and digital is similar to the difference between continuous-time and
discrete-time. However, in this case the difference involves the values of the function. Analog corresponds to a
continuous set of possible function values, while digital corresponds to a discrete set of possible function
values. A common example of a digital signal is a binary sequence, where the values of the function can
only be one or zero.
Figure 2.2 (figure omitted)



2.1.2.3 Periodic vs. Aperiodic 

Periodic signals 3 repeat with some period T, while aperiodic, or nonperiodic, signals do not (Figure 2.3). 
We can define a periodic function through the following mathematical expression, where t can be any number 
and T is a positive constant: 

f(t) = f(T + t) (2.1) 

The fundamental period of our function, $f(t)$, is the smallest value of $T$ that still allows (2.1) to be
true.

3 "Continuous Time Periodic Signals" <http://cnx.org/content/m10744/latest/>

Figure 2.3: (a) A periodic signal with period $T_0$ (b) An aperiodic signal
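
The periodicity condition (2.1) can be spot-checked numerically. The sketch below is only an illustration: the test signal (a 1 Hz sine), the sampling grid, and the tolerance are assumptions chosen for the example, and passing the check on a finite grid is evidence of periodicity, not a proof.

```python
import math

def is_periodic_with(f, T, t_min=-5.0, t_max=5.0, steps=1000, tol=1e-9):
    """Spot-check f(t) == f(t + T) on a uniform grid of sample times."""
    for i in range(steps + 1):
        t = t_min + (t_max - t_min) * i / steps
        if abs(f(t) - f(t + T)) > tol:
            return False
    return True

def sine_1hz(t):
    # Hypothetical test signal with fundamental period T = 1.
    return math.sin(2 * math.pi * t)

# T = 1 and T = 2 both satisfy (2.1); T = 0.5 does not.
# The fundamental period is the smallest T that passes, here T = 1.
candidates = [0.5, 1.0, 2.0]
passing = [T for T in candidates if is_periodic_with(sine_1hz, T)]
```

Any integer multiple of the fundamental period also satisfies (2.1), which is why the definition asks for the smallest such $T$.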



2.1.2.4 Finite vs. Infinite Length 

As the name implies, signals can be characterized by whether they have a finite or infinite length set of
values. Finite-length signals arise most often when dealing with discrete-time signals or a given sequence of
values. Mathematically speaking, $f(t)$ is a finite-length signal if it is nonzero only over a finite interval

$t_1 < t < t_2$

where $t_1 > -\infty$ and $t_2 < \infty$. An example can be seen in Figure 2.4. Similarly, an infinite-length signal,
$f(t)$, is one that may take nonzero values over the entire real line:

$-\infty < t < \infty$
Figure 2.4: Finite-Length Signal. Note that it only has nonzero values on a set, finite interval.

2.1.2.5 Causal vs. Anticausal vs. Noncausal

Causal signals are signals that are zero for all negative time, while anticausal are signals that are zero for 
all positive time. Noncausal signals are signals that have nonzero values in both positive and negative time 
(Figure 2.5). 




Figure 2.5: (a) A causal signal (b) An anticausal signal (c) A noncausal signal



2.1.2.6 Even vs. Odd 

An even signal is any signal $f$ such that $f(t) = f(-t)$. Even signals can be easily spotted as they
are symmetric around the vertical axis. An odd signal, on the other hand, is a signal $f$ such that
$f(t) = -f(-t)$ (Figure 2.6).



Figure 2.6: (a) An even signal (b) An odd signal



Using the definitions of even and odd signals, we can show that any signal can be written as a combination 
of an even and odd signal. That is, every signal has an odd-even decomposition. To demonstrate this, we 
have to look no further than a single equation. 



$f(t) = \frac{1}{2}\left(f(t) + f(-t)\right) + \frac{1}{2}\left(f(t) - f(-t)\right)$ (2.2)

By multiplying and adding this expression out, it can be shown to be true. Also, it can be shown that
$f(t) + f(-t)$ fulfills the requirement of an even function, while $f(t) - f(-t)$ fulfills the requirement of an
odd function (Figure 2.7).
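
The odd-even decomposition in (2.2) is easy to try out in code. The following is a minimal sketch, not part of the original module: the one-sided exponential used as the test signal is a hypothetical choice, picked only because it is neither even nor odd.

```python
import math

def even_part(f):
    """Return e(t) = (f(t) + f(-t)) / 2."""
    return lambda t: 0.5 * (f(t) + f(-t))

def odd_part(f):
    """Return o(t) = (f(t) - f(-t)) / 2."""
    return lambda t: 0.5 * (f(t) - f(-t))

def one_sided_exp(t):
    # Hypothetical test signal: exp(-t) for t >= 0, zero otherwise.
    return math.exp(-t) if t >= 0 else 0.0

e = even_part(one_sided_exp)
o = odd_part(one_sided_exp)

sample_times = [-2.0, -0.5, 0.0, 0.5, 2.0]
# e(t) + o(t) reproduces the original signal at every sample time.
reconstruction_error = max(abs(e(t) + o(t) - one_sided_exp(t)) for t in sample_times)
```

The same two helper functions work for any signal given as a Python function of `t`, since (2.2) makes no assumption about the signal being decomposed.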

Example 2.1 
Figure 2.7: (a) The signal we will decompose using odd-even decomposition (b) Even part: $e(t) = \frac{1}{2}(f(t) + f(-t))$
(c) Odd part: $o(t) = \frac{1}{2}(f(t) - f(-t))$ (d) Check: $e(t) + o(t) = f(t)$

2.1.2.7 Deterministic vs. Random

A deterministic signal is a signal in which each value of the signal is fixed and can be determined by a 
mathematical expression, rule, or table. Because of this the future values of the signal can be calculated 
from past values with complete confidence. On the other hand, a random signal 4 has a lot of uncertainty 
about its behavior. The future values of a random signal cannot be accurately predicted and can usually 
only be guessed based on the averages 5 of sets of signals (Figure 2.8). 
Figure 2.8: (a) Deterministic Signal (b) Random Signal



Example 2.2 

Consider the signal defined for all real $t$ described by

$f(t) = \begin{cases} \sin(2\pi t)/t & t \ge 1 \\ 0 & t < 1 \end{cases}$ (2.3)



This signal is continuous time, analog, aperiodic, infinite length, causal, neither even nor odd, and, 
by definition, deterministic. 



2.1.3 Signal Classifications Summary 

This module describes just some of the many ways in which signals can be classified. They can be continuous 
time or discrete time, analog or digital, periodic or aperiodic, finite or infinite, and deterministic or random. 
We can also divide them based on their causality and symmetry properties. There are other ways to classify 
signals, such as boundedness, handedness, and continuity, that are not discussed here but will be described 
in subsequent modules. 



4 "Introduction to Random Signals and Processes" <http://cnx.org/content/m10649/latest/>
5 "Random Processes: Mean and Variance" <http://cnx.org/content/m10656/latest/>

2.2 System Classifications and Properties 6

2.2.1 Introduction 

In this module some of the basic classifications of systems are briefly introduced and the most important
properties of these systems are explained. As will be seen, the properties of a system provide an easy way
to distinguish one system from another. Understanding these basic differences between systems, and their
properties, is fundamental to all signal and system courses. Once a set of systems has been
identified as sharing particular properties, one no longer has to re-prove a certain characteristic for each
system; it follows directly from the system classification.

2.2.2 Classification of Systems 

2.2.2.1 Continuous vs. Discrete 

One of the most important distinctions to understand is the difference between discrete-time and
continuous-time systems. A system in which the input signal and output signal both have continuous domains is said to
be a continuous system. One in which the input signal and output signal both have discrete domains is said
to be a discrete system. Of course, it is possible to conceive of systems that belong to neither category,
such as systems in which sampling of a continuous-time signal or reconstruction from a discrete-time signal
takes place.

2.2.2.2 Linear vs. Nonlinear 

A linear system is any system that obeys the properties of scaling (first order homogeneity) and superposition
(additivity), further described below. A nonlinear system is any system that lacks at least one of
these properties.

To show that a system H obeys the scaling property is to show that 

H(kf(t)) = kH(f(t)) (2.4) 




Figure 2.9: A block diagram demonstrating the scaling property of linearity



To demonstrate that a system H obeys the superposition property of linearity is to show that 

$H(f_1(t) + f_2(t)) = H(f_1(t)) + H(f_2(t))$ (2.5)



6 This content is available online at <http://cnx.org/content/m10084/2.21/>.

Figure 2.10: A block diagram demonstrating the superposition property of linearity



It is possible to check a system for linearity in a single (though larger) step. To do this, simply combine 
the first two steps to get 



$H(k_1 f_1(t) + k_2 f_2(t)) = k_1 H(f_1(t)) + k_2 H(f_2(t))$ (2.6)
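
The combined linearity condition (2.6) can be tested numerically on sampled signals. This is a rough sketch under stated assumptions: signals are represented as lists of samples, and the two systems (`scale_by_three` and `square`) are hypothetical examples introduced here. A passing check on one pair of inputs does not prove linearity, but a single failure does prove nonlinearity.

```python
def scale_by_three(x):
    # Hypothetical linear system: y = 3 x, applied sample by sample.
    return [3.0 * v for v in x]

def square(x):
    # Hypothetical nonlinear system: y = x^2, applied sample by sample.
    return [v * v for v in x]

def satisfies_2_6(H, f1, f2, k1, k2, tol=1e-9):
    """Compare H(k1 f1 + k2 f2) against k1 H(f1) + k2 H(f2) sample by sample."""
    combined_input = [k1 * a + k2 * b for a, b in zip(f1, f2)]
    lhs = H(combined_input)
    rhs = [k1 * a + k2 * b for a, b in zip(H(f1), H(f2))]
    return all(abs(l - r) <= tol for l, r in zip(lhs, rhs))

f1 = [0.0, 1.0, 2.0, 3.0]
f2 = [1.0, -1.0, 0.5, 2.0]

linear_ok = satisfies_2_6(scale_by_three, f1, f2, 2.0, -3.0)
square_ok = satisfies_2_6(square, f1, f2, 2.0, -3.0)
```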



2.2.2.3 Time Invariant vs. Time Varying 

A system is said to be time invariant if it commutes with the parameter shift operator defined by
$S_T(f(t)) = f(t - T)$ for all $T$, which is to say

$H S_T = S_T H$ (2.7)



for all real T. Intuitively, that means that for any input function that produces some output function, any 
time shift of that input function will produce an output function identical in every way except that it is 
shifted by the same amount. Any system that does not have this property is said to be time varying. 
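
A discrete-time analogue of condition (2.7) can be checked directly: delay the input and apply the system, then apply the system and delay the output, and compare. The sketch below makes several assumptions for illustration: signals are finite lists that start at rest (samples before index 0 are zero), delays are whole numbers of samples, and both example systems are hypothetical.

```python
def delay(x, T):
    """Shift operator S_T on a finite sequence: delay by T samples, zero-filling the start."""
    if T == 0:
        return list(x)
    return [0.0] * T + list(x[:-T])

def first_difference(x):
    # Hypothetical time-invariant system: y[n] = x[n] - x[n-1], with x[-1] taken as 0.
    return [x[n] - (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]

def ramp_gain(x):
    # Hypothetical time-varying system: y[n] = n * x[n].
    return [n * v for n, v in enumerate(x)]

def commutes_with_delay(H, x, T, tol=1e-12):
    """Check H(S_T x) == S_T(H x) sample by sample."""
    return all(abs(a - b) <= tol for a, b in zip(H(delay(x, T)), delay(H(x), T)))

x = [1.0, 2.0, -1.0, 0.5, 0.0, 3.0]
difference_is_invariant = commutes_with_delay(first_difference, x, 2)
ramp_is_invariant = commutes_with_delay(ramp_gain, x, 2)
```

The ramp-gain system fails because its gain depends on the absolute sample index, so delaying the input changes which gain each sample receives.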



Figure 2.11: This block diagram shows the condition for time invariance. The output is the
same whether the delay is put on the input or the output.



2.2.2.4 Causal vs. Noncausal 

A causal system is one in which the output depends only on current or past inputs, but not future inputs.
Similarly, an anticausal system is one in which the output depends only on current or future inputs, but not
past inputs. Finally, a noncausal system is one in which the output depends on both past and future inputs.
All "realtime" systems must be causal, since they cannot have future inputs available to them.

One may think the idea of future inputs does not seem to make much physical sense; however, we have
only been dealing with time as our independent variable so far, which is not always the case. Imagine rather
that we wanted to do image processing. Then the independent variable might represent pixel positions to the
left and right (the "future") of the current position on the image, and we would not necessarily have a causal
system.



Figure 2.12: (a) For a typical system to be causal... (b) ...the output at time $t_0$, $y(t_0)$, can only
depend on the portion of the input signal before $t_0$.



2.2.2.5 Stable vs. Unstable 

There are several definitions of stability, but the one that will be used most frequently in this course will 
be bounded input, bounded output (BIBO) stability. In this context, a stable system is one in which the 
output is bounded if the input is also bounded. Similarly, an unstable system is one in which at least one 
bounded input produces an unbounded output. 

Representing this mathematically, a stable system must have the following property, where $x(t)$ is the
input and $y(t)$ is the output. The output must satisfy the condition

$|y(t)| \le M_y < \infty$ (2.8)

whenever we have an input to the system that satisfies

$|x(t)| \le M_x < \infty$ (2.9)

$M_x$ and $M_y$ are finite positive numbers, and these relationships hold for all $t$. Otherwise,
the system is unstable.
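
To make the BIBO definition concrete, the sketch below feeds the same bounded input (a unit step, so $M_x = 1$) into two hypothetical discrete-time systems introduced only for illustration: a running sum and a "leaky" running sum. A finite simulation cannot prove stability or instability, but it shows how the peak output behaves as the observation window grows.

```python
def accumulator(x):
    """y[n] = y[n-1] + x[n]: a running sum (discrete-time integrator)."""
    y, total = [], 0.0
    for v in x:
        total += v
        y.append(total)
    return y

def leaky_accumulator(x, a=0.5):
    """y[n] = a * y[n-1] + x[n] with |a| < 1."""
    y, prev = [], 0.0
    for v in x:
        prev = a * prev + v
        y.append(prev)
    return y

def peak(y):
    """Largest output magnitude, a candidate for M_y over the simulated window."""
    return max(abs(v) for v in y)

def unit_step(n):
    # Bounded input: |x[n]| <= 1 for every n.
    return [1.0] * n

# The accumulator's peak keeps growing with the window length,
# while the leaky accumulator's peak stays below 1 / (1 - a) = 2.
accumulator_peaks = [peak(accumulator(unit_step(n))) for n in (10, 100, 1000)]
leaky_peaks = [peak(leaky_accumulator(unit_step(n))) for n in (10, 100, 1000)]
```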

2.2.3 System Classifications Summary 

This module describes just some of the many ways in which systems can be classified. Systems can be 
continuous time, discrete time, or neither. They can be linear or nonlinear, time invariant or time varying, 
and stable or unstable. We can also divide them based on their causality properties. There are other ways 
to classify systems, such as use of memory, that are not discussed here but will be described in subsequent 
modules. 

2.3 The Fourier Series 7 

2.3.1 Theorems on the Fourier Series 

Four of the most important theorems in the theory of Fourier analysis are the inversion theorem, the
convolution theorem, the differentiation theorem, and Parseval's theorem [4]. All of these are based on the
orthogonality of the basis function of the Fourier series and integral and all require knowledge of the
convergence of the sums and integrals. The practical and theoretical use of Fourier analysis is greatly expanded
if use is made of distributions or generalized functions [8][1]. Because energy is an important measure of a
function in signal processing applications, the Hilbert space of $L^2$ functions is a proper setting for the basic
theory and a geometric view can be especially useful [5][4].

The following theorems and results concern the existence and convergence of the Fourier series and the 
discrete-time Fourier transform [7]. Details, discussions and proofs can be found in the cited references. 

• If $f(x)$ has bounded variation in the interval $(-\pi, \pi)$, the Fourier series corresponding to $f(x)$ converges
to the value $f(x)$ at any point within the interval at which the function is continuous; it converges
to the value $\frac{1}{2}[f(x+0) + f(x-0)]$ at any such point at which the function is discontinuous. At the
points $\pi$, $-\pi$ it converges to the value $\frac{1}{2}[f(-\pi+0) + f(\pi-0)]$. [6]

• If $f(x)$ is of bounded variation in $(-\pi, \pi)$, the Fourier series converges to $f(x)$, uniformly in any
interval $(a, b)$ in which $f(x)$ is continuous, the continuity at $a$ and $b$ being on both sides. [6]

• If $f(x)$ is of bounded variation in $(-\pi, \pi)$, the Fourier series converges to $\frac{1}{2}[f(x+0) + f(x-0)]$,
bounded throughout the interval $(-\pi, \pi)$. [6]

• If $f(x)$ is bounded and if it is continuous in its domain at every point, with the exception of a finite
number of points at which it may have ordinary discontinuities, and if the domain may be divided into
a finite number of parts, such that in any one of them the function is monotone; or, in other words,
the function has only a finite number of maxima and minima in its domain, the Fourier series of $f(x)$
converges to $f(x)$ at points of continuity and to $\frac{1}{2}[f(x+0) + f(x-0)]$ at points of discontinuity.
[6][3]

• If $f(x)$ is such that, when the arbitrarily small neighborhoods of a finite number of points in whose
neighborhood $|f(x)|$ has no upper bound have been excluded, $f(x)$ becomes a function with bounded
variation, then the Fourier series converges to the value $\frac{1}{2}[f(x+0) + f(x-0)]$ at every point in
$(-\pi, \pi)$, except the points of infinite discontinuity of the function, provided the improper integral
$\int_{-\pi}^{\pi} f(x)\,dx$ exists and is absolutely convergent. [6]

• If $f$ is of bounded variation, the Fourier series of $f$ converges at every point $x$ to the value
$[f(x+0) + f(x-0)]/2$. If $f$ is, in addition, continuous at every point of an interval $I = (a, b)$,
its Fourier series is uniformly convergent in $I$. [10]

• If $a(k)$ and $b(k)$ are absolutely summable, the Fourier series converges uniformly to $f(x)$, which is
continuous. [7]



7 This content is available online at <http://cnx.org/content/m13873/1.1/>.
• If $a(k)$ and $b(k)$ are square summable, the Fourier series converges to $f(x)$ where it is continuous, but
not necessarily uniformly. [7]

• Suppose that $f(x)$ is periodic, of period $X$, is defined and bounded on $[0, X]$ and that at least one
of the following four conditions is satisfied: (i) $f$ is piecewise monotonic on $[0, X]$, (ii) $f$ has a finite
number of maxima and minima on $[0, X]$ and a finite number of discontinuities on $[0, X]$, (iii) $f$ is of
bounded variation on $[0, X]$, (iv) $f$ is piecewise smooth on $[0, X]$: then it will follow that the Fourier
series coefficients may be defined through the defining integral, using proper Riemann integrals, and
that the Fourier series converges to $f(x)$ at almost all $x$, to $f(x)$ at each point of continuity of $f$, and to the
value $\frac{1}{2}[f(x^-) + f(x^+)]$ at all $x$. [4]

• For any $1 \le p < \infty$ and any $f \in C^p(S^1)$, the partial sums

$S_n = S_n(f) = \sum_{|k| \le n} \hat{f}(k)\, e_k$ (2.10)

converge to $f$, uniformly as $n \to \infty$; in fact, $\|S_n - f\|_\infty$ is bounded by a constant multiple of $n^{-p+1/2}$.
[5]

The Fourier series expansion results in transforming a periodic, continuous time function, x(t), to two 
discrete indexed frequency functions, a (k) and b (k) that are not periodic. 
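
The behavior at a discontinuity described by the theorems above can be observed numerically. The sketch below assumes a specific test function that does not appear in the original text: $f(x) = \mathrm{sign}(x)$ on $(-\pi, \pi)$, whose well-known Fourier series is $\frac{4}{\pi}\sum_{m \ge 0} \frac{\sin((2m+1)x)}{2m+1}$. At the jump $x = 0$ the one-sided limits are $-1$ and $+1$, so the series should converge to their average, $0$.

```python
import math

def square_wave_partial_sum(x, n_terms):
    """Sum of the first n_terms odd harmonics of the series for sign(x) on (-pi, pi)."""
    return (4.0 / math.pi) * sum(
        math.sin((2 * m + 1) * x) / (2 * m + 1) for m in range(n_terms)
    )

# At the discontinuity every term vanishes, so each partial sum equals
# the midpoint value (f(0+) + f(0-)) / 2 = 0.
value_at_jump = square_wave_partial_sum(0.0, 50)

# At a point of continuity the partial sums approach f(x) itself.
value_at_quarter_period = square_wave_partial_sum(math.pi / 2, 2000)
```

Convergence at $x = \pi/2$ is slow (the error shrinks roughly like one over the number of terms), which is consistent with the coefficients of a discontinuous function not being absolutely summable.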

2.4 The Fourier Transform 8 

2.4.1 The Fourier Transform 

Many practical problems in signal analysis involve either infinitely long or very long signals where the 
Fourier series is not appropriate. For these cases, the Fourier transform (FT) and its inverse (IFT) have 
been developed. This transform has been used with great success in virtually all quantitative areas of science 
and technology where the concept of frequency is important. While the Fourier series was used before Fourier 
worked on it, the Fourier transform seems to be his original idea. It can be derived as an extension of the 
Fourier series by letting the length increase to infinity or the Fourier transform can be independently defined 
and then the Fourier series shown to be a special case of it. The latter approach is the more general of the 
two, but the former is more intuitive [9] [2]. 

2.4.1.1 Definition of the Fourier Transform 

The Fourier transform (FT) of a real-valued (or complex) function of the real variable $t$ is defined by

$X(\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt$ (2.11)

giving a complex-valued function of the real variable $\omega$ representing frequency. The inverse Fourier transform
(IFT) is given by

$x(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(\omega)\, e^{j\omega t}\, d\omega$. (2.12)

Because of the infinite limits on both integrals, the question of convergence is important. There are useful 
practical signals that do not have Fourier transforms if only classical functions are allowed because of problems 
with convergence. The use of delta functions (distributions) in both the time and frequency domains allows 
a much larger class of signals to be represented [9]. 



8 This content is available online at <http://cnx.org/content/m13874/1.2/>.

2.4.1.2 Examples of the Fourier Transform

Deriving a few basic transforms and using the properties allows a large class of signals to be easily studied. 
Examples of modulation, sampling, and others will be given. 

• If $x(t) = \delta(t)$ then $X(\omega) = 1$

• If $x(t) = 1$ then $X(\omega) = 2\pi\delta(\omega)$

• If $x(t)$ is an infinite sequence of delta functions spaced $T$ apart, $x(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT)$, its
transform is also an infinite sequence of delta functions of weight $2\pi/T$ spaced $2\pi/T$ apart,
$X(\omega) = \frac{2\pi}{T} \sum_{k=-\infty}^{\infty} \delta(\omega - 2\pi k/T)$.

• Other interesting and illustrative examples can be found in [9][2].

Note that the Fourier transform takes a function of continuous time into a function of continuous frequency,
neither function being periodic. If "distributions" or "delta functions" are allowed, the Fourier transform of
a periodic function will be an infinitely long string of delta functions with weights that are the Fourier series
coefficients.
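
For signals that are ordinary functions, the defining integral (2.11) can be approximated by a Riemann sum. The sketch below is an illustration under stated assumptions: the test signal is a Gaussian pulse $x(t) = e^{-t^2/2}$, which is not one of the examples above but has the known closed-form transform $X(\omega) = \sqrt{2\pi}\, e^{-\omega^2/2}$ under the convention of (2.11); the integration limits and step count are arbitrary choices that work because the pulse decays quickly.

```python
import cmath
import math

def fourier_transform(x, omega, t_min=-10.0, t_max=10.0, steps=2000):
    """Riemann-sum approximation of X(omega) = integral of x(t) exp(-j omega t) dt."""
    dt = (t_max - t_min) / steps
    total = 0j
    for i in range(steps + 1):
        t = t_min + i * dt
        total += x(t) * cmath.exp(-1j * omega * t) * dt
    return total

def gaussian_pulse(t):
    # Assumed test signal; chosen because its transform is known in closed form.
    return math.exp(-t * t / 2.0)

def gaussian_transform(omega):
    # Closed-form transform of the Gaussian pulse above.
    return math.sqrt(2.0 * math.pi) * math.exp(-omega * omega / 2.0)

# Largest gap between the numerical and closed-form values at a few frequencies.
max_error = max(
    abs(fourier_transform(gaussian_pulse, w) - gaussian_transform(w))
    for w in (0.0, 1.0, 2.0)
)
```

This approach does not work for the delta-function examples, which are distributions rather than ordinary functions and have to be handled analytically.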

2.5 Review of Probability and Random Variables 9 

The focus of this course is on digital communication, which involves transmission of information, in its 
most general sense, from source to destination using digital technology. Engineering such a system requires 
modeling both the information and the transmission media. Interestingly, modeling both digital or analog 
information and many physical media requires a probabilistic setting. In this chapter and in the next one we 
will review the theory of probability, model random signals, and characterize their behavior as they traverse 
through deterministic systems disturbed by noise and interference. In order to develop practical models for 
random phenomena we start with carrying out a random experiment. We then introduce definitions, rules, 
and axioms for modeling within the context of the experiment. The outcome of a random experiment is
denoted by $\omega$. The sample space $\Omega$ is the set of all possible outcomes of a random experiment. Such outcomes
could be an abstract description in words. A scientific experiment should indeed be repeatable where each
outcome could naturally have an associated probability of occurrence. This is defined formally as the ratio
of the number of times the outcome occurs to the total number of times the experiment is repeated.

2.5.1 Random Variables 

A random variable is the assignment of a real number to each outcome of a random experiment. 




Figure 2.13 (figure omitted)



9 This content is available online at <http://cnx.org/content/m10224/2.16/>.






Example 2.3 

Roll a die. Outcomes $\{\omega_1, \omega_2, \omega_3, \omega_4, \omega_5, \omega_6\}$
$\omega_i$ = $i$ dots on the face of the die.
$X(\omega_i) = i$
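
The die of Example 2.3 also illustrates the relative-frequency view of probability described earlier: repeat the experiment many times and divide the number of occurrences of each outcome by the number of repetitions. The sketch below simulates this with Python's pseudo-random generator; the function name, the number of rolls, and the seed are arbitrary choices made for the illustration.

```python
import random

def relative_frequencies(n_rolls, seed=0):
    """Roll a simulated fair die n_rolls times and return the relative frequency of each face."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = {face: 0 for face in range(1, 7)}
    for _ in range(n_rolls):
        face = rng.randint(1, 6)  # X(omega_i) = i: the random variable is the number of dots
        counts[face] += 1
    return {face: count / n_rolls for face, count in counts.items()}

frequencies = relative_frequencies(60000)
largest_deviation = max(abs(p - 1.0 / 6.0) for p in frequencies.values())
```

With more rolls, the relative frequencies tend to settle closer to $1/6$, the probability one would assign to each face of a fair die.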



2.5.2 Distributions 

Probability assignments on intervals $a < X \le b$

Definition 2.1: Cumulative distribution

The cumulative distribution function of a random variable $X$ is a function $F_X : \mathbb{R} \to \mathbb{R}$ such that

$F_X(b) = \Pr[X \le b] = \Pr[\{\omega \in \Omega \mid X(\omega) \le b\}]$ (2.13)




Figure 2.14 (figure omitted: the event $A = \{\omega \in \Omega \mid X(\omega) \le b\}$)



Definition 2.2: Continuous Random Variable

A random variable $X$ is continuous if the cumulative distribution function can be written in an
integral form, or

$F_X(b) = \int_{-\infty}^{b} f_X(x)\, dx$ (2.14)

and $f_X(x)$ is the probability density function (pdf) (e.g., $F_X(x)$ is differentiable and
$f_X(x) = \frac{d}{dx} F_X(x)$).

Definition 2.3: Discrete Random Variable

A random variable $X$ is discrete if it only takes at most countably many points (i.e., $F_X(\cdot)$ is
piecewise constant). The probability mass function (pmf) is defined as

$p_X(x_k) = \Pr[X = x_k] = F_X(x_k) - \lim_{x \to x_k,\ x < x_k} F_X(x)$ (2.15)



Two random variables defined on an experiment have joint distribution

$F_{X,Y}(a, b) = \Pr[X \le a, Y \le b] = \Pr[\{\omega \in \Omega \mid (X(\omega) \le a) \wedge (Y(\omega) \le b)\}]$ (2.16)
Figure 2.15



Joint pdf can be obtained if they are jointly continuous

$F_{X,Y}(a, b) = \int_{-\infty}^{b} \int_{-\infty}^{a} f_{X,Y}(x, y)\, dx\, dy$ (2.17)

(e.g., $f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}(x, y)}{\partial x\, \partial y}$)

Joint pmf if they are jointly discrete

$p_{X,Y}(x_k, y_l) = \Pr[X = x_k, Y = y_l]$ (2.18)



Conditional density function

$f_{Y|X}(y|x) = \frac{f_{X,Y}(x, y)}{f_X(x)}$ (2.19)

for all $x$ with $f_X(x) > 0$; otherwise the conditional density is not defined for those values of $x$ with $f_X(x) = 0$.

Two random variables are independent if

$f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$ (2.20)

for all $x \in \mathbb{R}$ and $y \in \mathbb{R}$. For discrete random variables,

$p_{X,Y}(x_k, y_l) = p_X(x_k)\, p_Y(y_l)$ (2.21)

for all $k$ and $l$.

2.5.3 Moments

Statistical quantities to represent some of the characteristics of a random variable.

$\overline{g(X)} = E[g(X)] = \begin{cases} \int_{-\infty}^{\infty} g(x)\, f_X(x)\, dx & \text{if continuous} \\ \sum_k g(x_k)\, p_X(x_k) & \text{if discrete} \end{cases}$ (2.22)
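• Mean

$$\mu_X = \overline{X}$$ (2.23)

• Second moment

$$E[X^2] = \overline{X^2}$$ (2.24)

• Variance

$$\mathrm{Var}(X) = \sigma(X)^2 = \overline{(X - \mu_X)^2} = \overline{X^2} - \mu_X^2$$ (2.25)

• Characteristic function

$$\Phi_X(u) = \overline{e^{iuX}}$$ (2.26)

for $u \in \mathbb{R}$, where $i = \sqrt{-1}$

• Correlation between two random variables

$$R_{XY} = \overline{XY^*} = \begin{cases} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x y^* f_{X,Y}(x,y)\,dx\,dy & \text{if } X \text{ and } Y \text{ are jointly continuous} \\ \sum_k \sum_l x_k y_l^* p_{X,Y}(x_k,y_l) & \text{if } X \text{ and } Y \text{ are jointly discrete} \end{cases}$$ (2.27)

• Covariance

$$C_{XY} = \mathrm{Cov}(X,Y) = \overline{(X - \mu_X)(Y - \mu_Y)^*} = R_{XY} - \mu_X \mu_Y^*$$ (2.28)

• Correlation coefficient

$$\rho_{XY} = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y}$$ (2.29)

Definition 2.4: Uncorrelated random variables
Two random variables $X$ and $Y$ are uncorrelated if $\rho_{XY} = 0$.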
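
The moments listed above can be computed directly from a joint pmf. The following is a minimal sketch for real-valued discrete random variables; the values in `x_vals`, `y_vals` and `p_xy` are made up for illustration and do not come from the text.

```python
import numpy as np

# Hypothetical joint pmf: p_xy[k, l] = Pr[X = x_k, Y = y_l] (illustrative values only)
x_vals = np.array([0.0, 1.0, 2.0])
y_vals = np.array([-1.0, 1.0])
p_xy = np.array([[0.10, 0.20],
                 [0.30, 0.10],
                 [0.05, 0.25]])

def moments(x_vals, y_vals, p_xy):
    """Return mean, variance, correlation, covariance and correlation coefficient."""
    p_x = p_xy.sum(axis=1)                      # marginal pmf of X
    p_y = p_xy.sum(axis=0)                      # marginal pmf of Y
    mean_x = np.sum(x_vals * p_x)
    mean_y = np.sum(y_vals * p_y)
    var_x = np.sum(x_vals**2 * p_x) - mean_x**2  # second moment minus squared mean
    var_y = np.sum(y_vals**2 * p_y) - mean_y**2
    r_xy = np.sum(np.outer(x_vals, y_vals) * p_xy)   # correlation R_XY
    c_xy = r_xy - mean_x * mean_y                    # covariance C_XY
    rho = c_xy / np.sqrt(var_x * var_y)              # correlation coefficient
    return {"mean_x": mean_x, "mean_y": mean_y, "var_x": var_x,
            "var_y": var_y, "r_xy": r_xy, "c_xy": c_xy, "rho": rho}

m = moments(x_vals, y_vals, p_xy)
```

For this particular pmf the covariance is nonzero, so $X$ and $Y$ are correlated; replacing `p_xy` by the outer product of its marginals would give $\rho_{XY} = 0$.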

2.6 Introduction to Stochastic Processes 10 

2.6.1 Definitions, distributions, and stationarity 

Definition 2.5: Stochastic Process 

Given a sample space, a stochastic process is an indexed collection of random variables defined for each $\omega \in \Omega$.

$$\forall t, t \in \mathbb{R} : X_t(\omega)$$ (2.30)

Example 2.4 

Received signal at an antenna as in Figure 2.16. 



10 This content is available online at <http://cnx.org/content/m10235/2.15/>.
Figure 2.16: Sample paths (voltage).



For a given $t$, $X_t(\omega)$ is a random variable with a distribution.

First-order distribution

$$F_{X_t}(b) = \Pr[X_t \le b] = \Pr[\{\omega \in \Omega \mid X_t(\omega) \le b\}]$$ (2.31)

Definition 2.6: First-order stationary process

If $F_{X_t}(b)$ is not a function of time then $X_t$ is called a first-order stationary process.

Second-order distribution

$$F_{X_{t_1},X_{t_2}}(b_1,b_2) = \Pr[X_{t_1} \le b_1, X_{t_2} \le b_2]$$ (2.32)

for all $t_1 \in \mathbb{R}$, $t_2 \in \mathbb{R}$, $b_1 \in \mathbb{R}$, $b_2 \in \mathbb{R}$.

Nth-order distribution

$$F_{X_{t_1},X_{t_2},\ldots,X_{t_N}}(b_1,b_2,\ldots,b_N) = \Pr[X_{t_1} \le b_1, \ldots, X_{t_N} \le b_N]$$ (2.33)

Nth-order stationary: A random process is stationary of order $N$ if, for every time shift $T$,

$$F_{X_{t_1},X_{t_2},\ldots,X_{t_N}}(b_1,b_2,\ldots,b_N) = F_{X_{t_1+T},X_{t_2+T},\ldots,X_{t_N+T}}(b_1,b_2,\ldots,b_N)$$ (2.34)

Strictly stationary: A process is strictly stationary if it is Nth-order stationary for all $N$.

Example 2.5 

$X_t = \cos(2\pi f_0 t + \Theta(\omega))$ where $f_0$ is the deterministic carrier frequency and $\Theta(\omega) : \Omega \to \mathbb{R}$ is a random variable defined over $[-\pi, \pi]$ and is assumed to be a uniform random variable; i.e.,

$$f_\Theta(\theta) = \begin{cases} \frac{1}{2\pi} & \text{if } \theta \in [-\pi, \pi] \\ 0 & \text{otherwise} \end{cases}$$

$$F_{X_t}(b) = \Pr[X_t \le b] = \Pr[\cos(2\pi f_0 t + \Theta) \le b]$$ (2.35)

$$F_{X_t}(b) = \Pr[-\pi \le 2\pi f_0 t + \Theta \le -\arccos(b)] + \Pr[\arccos(b) \le 2\pi f_0 t + \Theta \le \pi]$$ (2.36)

$$F_{X_t}(b) = \int_{(-\pi) - 2\pi f_0 t}^{(-\arccos(b)) - 2\pi f_0 t} \frac{1}{2\pi}\,d\theta + \int_{\arccos(b) - 2\pi f_0 t}^{\pi - 2\pi f_0 t} \frac{1}{2\pi}\,d\theta = (2\pi - 2\arccos(b)) \frac{1}{2\pi}$$ (2.37)

$$f_{X_t}(x) = \frac{d}{dx}\left(1 - \frac{1}{\pi}\arccos(x)\right) = \begin{cases} \frac{1}{\pi\sqrt{1 - x^2}} & \text{if } |x| < 1 \\ 0 & \text{otherwise} \end{cases}$$ (2.38)

This process is stationary of order 1.

Figure 2.17: Plots of cosines with different phases and the same frequency.



Second-order stationarity can be determined by first considering conditional densities and the joint density. Recall that

$$X_t = \cos(2\pi f_0 t + \Theta)$$ (2.39)

Then the relevant step is to find

$$\Pr[X_{t_2} \le b_2 \mid X_{t_1} = x_1]$$ (2.40)

Note that

$$(X_{t_1} = x_1 = \cos(2\pi f_0 t_1 + \Theta)) \Rightarrow (\Theta = \arccos(x_1) - 2\pi f_0 t_1)$$ (2.41)

$$X_{t_2} = \cos(2\pi f_0 t_2 + \arccos(x_1) - 2\pi f_0 t_1) = \cos(2\pi f_0 (t_2 - t_1) + \arccos(x_1))$$ (2.42)

Figure 2.18: $\Pr[X_{t_2} \le b_2 \mid X_{t_1} = x_1]$ plotted against $b_2$; the marked point is $\cos(2\pi f_0 (t_2 - t_1) + \arccos(x_1))$.

$$F_{X_{t_2},X_{t_1}}(b_2,b_1) = \int_{-\infty}^{b_1} f_{X_{t_1}}(x_1) \Pr[X_{t_2} \le b_2 \mid X_{t_1} = x_1]\,dx_1$$ (2.43)

Note that this is only a function of $t_2 - t_1$.

Example 2.6 

Every $T$ seconds, a fair coin is tossed. If heads, then $X_t = 1$ for $nT \le t < (n+1)T$. If tails, then $X_t = -1$ for $nT \le t < (n+1)T$.

Figure 2.19: Sample function of $X_t$ (time axis marked in multiples of $T$).

$$p_{X_t}(x) = \begin{cases} \frac{1}{2} & \text{if } x = 1 \\ \frac{1}{2} & \text{if } x = -1 \end{cases}$$ (2.44)

for all $t \in \mathbb{R}$. $X_t$ is stationary of order 1.

Second-order probability mass function

$$p_{X_{t_1} X_{t_2}}(x_1,x_2) = p_{X_{t_2}|X_{t_1}}(x_2|x_1)\, p_{X_{t_1}}(x_1)$$ (2.45)

The conditional pmf

$$p_{X_{t_2}|X_{t_1}}(x_2|x_1) = \begin{cases} 0 & \text{if } x_2 \ne x_1 \\ 1 & \text{if } x_2 = x_1 \end{cases}$$ (2.46)

when $nT \le t_1 < (n+1)T$ and $nT \le t_2 < (n+1)T$ for some $n$.

$$p_{X_{t_2}|X_{t_1}}(x_2|x_1) = p_{X_{t_2}}(x_2)$$ (2.47)

for all $x_1$ and for all $x_2$ when $nT \le t_1 < (n+1)T$ and $mT \le t_2 < (m+1)T$ with $n \ne m$.

$$p_{X_{t_2} X_{t_1}}(x_2,x_1) = \begin{cases} 0 & \text{if } x_2 \ne x_1 \text{ for } nT \le t_1, t_2 < (n+1)T \\ p_{X_{t_1}}(x_1) & \text{if } x_2 = x_1 \text{ for } nT \le t_1, t_2 < (n+1)T \\ p_{X_{t_1}}(x_1)\, p_{X_{t_2}}(x_2) & \text{if } n \ne m \text{ for } (nT \le t_1 < (n+1)T) \wedge (mT \le t_2 < (m+1)T) \end{cases}$$ (2.48)

2.7 Second-order Description of Stochastic Processes 11 

2.7.1 Second-order description 

Practical and incomplete statistics 

Definition 2.7: Mean 

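The mean function of a random process $X_t$ is defined as the expected value of $X_t$ for all $t$.

$$\mu_{X_t} = E[X_t] = \begin{cases} \int_{-\infty}^{\infty} x f_{X_t}(x)\,dx & \text{if continuous} \\ \sum_{k=-\infty}^{\infty} x_k p_{X_t}(x_k) & \text{if discrete} \end{cases}$$ (2.49)

Definition 2.8: Autocorrelation

The autocorrelation function of the random process $X_t$ is defined as

$$R_X(t_2,t_1) = E[X_{t_2} X_{t_1}^*] = \begin{cases} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x_2 x_1^* f_{X_{t_2},X_{t_1}}(x_2,x_1)\,dx_1\,dx_2 & \text{if continuous} \\ \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} x_l x_k^* p_{X_{t_2},X_{t_1}}(x_l,x_k) & \text{if discrete} \end{cases}$$ (2.50)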

Rule 2.1: 

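If $X_t$ is second-order stationary, then $R_X(t_2,t_1)$ only depends on $t_2 - t_1$.
Proof:

$$R_X(t_2,t_1) = E[X_{t_2} X_{t_1}^*] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x_2 x_1^* f_{X_{t_2},X_{t_1}}(x_2,x_1)\,dx_2\,dx_1$$ (2.51)

By second-order stationarity the joint density is unchanged when both times are shifted by $-t_1$, so

$$R_X(t_2,t_1) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x_2 x_1^* f_{X_{t_2 - t_1},X_0}(x_2,x_1)\,dx_2\,dx_1 = R_X(t_2 - t_1, 0)$$ (2.52)

11 This content is available online at <http://cnx.org/content/m10236/2.13/>.

If $R_X(t_2,t_1)$ depends on $t_2 - t_1$ only, then we will represent the autocorrelation with only one variable $\tau = t_2 - t_1$

$$R_X(\tau) = R_X(t_2 - t_1) = R_X(t_2,t_1)$$ (2.53)

Properties

1. $R_X(0) \ge 0$

2. $R_X(\tau) = R_X^*(-\tau)$ (so $R_X(\tau) = R_X(-\tau)$ for a real-valued process)

3. $|R_X(\tau)| \le R_X(0)$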

Example 2.7 

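$X_t = \cos(2\pi f_0 t + \Theta(\omega))$ and $\Theta$ is uniformly distributed between 0 and $2\pi$. The mean function

$$\mu_X(t) = E[X_t] = E[\cos(2\pi f_0 t + \Theta)] = \int_0^{2\pi} \cos(2\pi f_0 t + \theta) \frac{1}{2\pi}\,d\theta = 0$$ (2.54)

The autocorrelation function

$$\begin{aligned} R_X(t + \tau, t) &= E[X_{t+\tau} X_t] \\ &= E[\cos(2\pi f_0 (t + \tau) + \Theta) \cos(2\pi f_0 t + \Theta)] \\ &= \tfrac{1}{2} E[\cos(2\pi f_0 \tau)] + \tfrac{1}{2} E[\cos(2\pi f_0 (2t + \tau) + 2\Theta)] \\ &= \tfrac{1}{2} \cos(2\pi f_0 \tau) + \tfrac{1}{2} \int_0^{2\pi} \cos(2\pi f_0 (2t + \tau) + 2\theta) \frac{1}{2\pi}\,d\theta \\ &= \tfrac{1}{2} \cos(2\pi f_0 \tau) \end{aligned}$$ (2.55)

Not a function of $t$ since the second term in the right hand side of the equality in (2.55) is zero.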
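
The result of Example 2.7 can be checked numerically by averaging over a fine, evenly spaced grid of phases instead of integrating analytically. This is only a sketch: the carrier frequency `f0` and the grid size are arbitrary choices, not values from the text.

```python
import numpy as np

f0 = 5.0  # hypothetical carrier frequency in Hz (illustrative)
# Evenly spaced phases over one period stand in for the uniform density 1/(2*pi)
theta = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)

def mean_fn(t):
    """Approximate E[cos(2*pi*f0*t + Theta)] by averaging over the phase grid."""
    return np.mean(np.cos(2.0 * np.pi * f0 * t + theta))

def autocorr(t, tau):
    """Approximate R_X(t + tau, t) by averaging the product over the phase grid."""
    x_later = np.cos(2.0 * np.pi * f0 * (t + tau) + theta)
    x_now = np.cos(2.0 * np.pi * f0 * t + theta)
    return np.mean(x_later * x_now)

def autocorr_closed_form(tau):
    """Closed form (1/2) cos(2*pi*f0*tau) from equation (2.55)."""
    return 0.5 * np.cos(2.0 * np.pi * f0 * tau)
```

Evaluating `autocorr(t, tau)` for several values of `t` with `tau` fixed should give the same number each time, which is the numerical counterpart of the statement that the autocorrelation is not a function of $t$.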

Example 2.8 

Toss a fair coin every $T$ seconds. Since $X_t$ is a discrete-valued random process, the statistical characteristics can be captured by the pmf, and the mean function is written as

$$\mu_X(t) = E[X_t] = \tfrac{1}{2} \times (-1) + \tfrac{1}{2} \times 1 = 0$$ (2.56)

$$R_X(t_2,t_1) = \sum_k \sum_l x_k x_l\, p_{X_{t_2},X_{t_1}}(x_k,x_l) = 1 \times 1 \times \tfrac{1}{2} + (-1) \times (-1) \times \tfrac{1}{2} = 1$$ (2.57)

when $nT \le t_1 < (n+1)T$ and $nT \le t_2 < (n+1)T$

$$R_X(t_2,t_1) = 1 \times 1 \times \tfrac{1}{4} + (-1) \times (-1) \times \tfrac{1}{4} + (-1) \times 1 \times \tfrac{1}{4} + 1 \times (-1) \times \tfrac{1}{4} = 0$$ (2.58)

when $nT \le t_1 < (n+1)T$ and $mT \le t_2 < (m+1)T$ with $n \ne m$

$$R_X(t_2,t_1) = \begin{cases} 1 & \text{if } (nT \le t_1 < (n+1)T) \wedge (nT \le t_2 < (n+1)T) \\ 0 & \text{otherwise} \end{cases}$$ (2.59)

A function of $t_1$ and $t_2$.
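
A small simulation makes the dependence on both $t_1$ and $t_2$ visible. The sketch below assumes $T = 1$ and a fixed number of realizations; both are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1.0              # toss period (illustrative)
n_trials = 20000     # number of simulated realizations
n_intervals = 4      # simulate 0 <= t < 4T

# tosses[r, n] is the value (+1 or -1) held during nT <= t < (n+1)T in realization r
tosses = rng.choice([-1.0, 1.0], size=(n_trials, n_intervals))

def sample(t):
    """Value of X_t in every realization."""
    return tosses[:, int(np.floor(t / T))]

def autocorr_estimate(t2, t1):
    """Monte Carlo estimate of R_X(t2, t1) = E[X_t2 X_t1]."""
    return np.mean(sample(t2) * sample(t1))

same_interval = autocorr_estimate(0.2, 0.7)       # both times in [0, T)
different_interval = autocorr_estimate(1.5, 0.5)  # times in different toss intervals
```

Note that `autocorr_estimate(0.2, 0.7)` and `autocorr_estimate(0.7, 1.2)` use the same time difference of 0.5 but fall on different sides of a toss boundary, so they give different answers; this is why the process is not wide sense stationary.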

Definition 2.9: Wide Sense Stationary 

A process is said to be wide sense stationary if $\mu_X$ is constant and $R_X(t_2,t_1)$ is only a function of $t_2 - t_1$.

Rule 2.2: 

If X t is strictly stationary, then it is wide sense stationary. The converse is not necessarily true. 

Definition 2.10: Autocovariance 

Autocovariance of a random process is defined as

$$C_X(t_2,t_1) = E[(X_{t_2} - \mu_X(t_2))(X_{t_1} - \mu_X(t_1))^*] = R_X(t_2,t_1) - \mu_X(t_2)\,\mu_X^*(t_1)$$ (2.60)

The variance of $X_t$ is $\mathrm{Var}(X_t) = C_X(t,t)$

Two processes defined on one experiment (Figure 2.20). 



Figure 2.20: $X_t$ → Linear System → $Y_t$



Definition 2.11: Crosscorrelation 

The crosscorrelation function of a pair of random processes is defined as

$$R_{XY}(t_2,t_1) = E[X_{t_2} Y_{t_1}^*] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x y^* f_{X_{t_2},Y_{t_1}}(x,y)\,dx\,dy$$ (2.61)

$$C_{XY}(t_2,t_1) = R_{XY}(t_2,t_1) - \mu_X(t_2)\,\mu_Y^*(t_1)$$ (2.62)



Definition 2.12: Jointly Wide Sense Stationary 

The random processes $X_t$ and $Y_t$ are said to be jointly wide sense stationary if $R_{XY}(t_2,t_1)$ is a function of $t_2 - t_1$ only and $\mu_X(t)$ and $\mu_Y(t)$ are constant.
2.8 Gaussian Processes 12

2.8.1 Gaussian Random Processes 

Definition 2.13: Gaussian process 

A process with mean $\mu_X(t)$ and covariance function $C_X(t_2,t_1)$ is said to be a Gaussian process if any $X = (X_{t_1}, X_{t_2}, \ldots, X_{t_N})^T$ formed by any sampling of the process is a Gaussian random vector, that is,

$$f_X(x) = \frac{1}{(2\pi)^{N/2} (\det \Sigma_X)^{1/2}} e^{-\left(\frac{1}{2}(x - \mu_X)^T \Sigma_X^{-1} (x - \mu_X)\right)}$$ (2.63)

for all $x \in \mathbb{R}^N$ where

$$\mu_X = \begin{pmatrix} \mu_X(t_1) \\ \vdots \\ \mu_X(t_N) \end{pmatrix}$$

and

$$\Sigma_X = \begin{pmatrix} C_X(t_1,t_1) & \cdots & C_X(t_1,t_N) \\ \vdots & \ddots & \vdots \\ C_X(t_N,t_1) & \cdots & C_X(t_N,t_N) \end{pmatrix}$$

The complete statistical properties of $X_t$ can be obtained from the second-order statistics.
Properties 

1. If a Gaussian process is WSS, then it is strictly stationary.

2. If two Gaussian processes are uncorrelated, then they are also statistically independent. 

3. Any linear processing of a Gaussian process results in a Gaussian process. 
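
The density in (2.63) is fully determined by the mean vector and the covariance matrix, which the following sketch evaluates with NumPy. The covariance function `c_x` is a hypothetical WSS example, $C_X(t_2,t_1) = e^{-|t_2 - t_1|}$, chosen only for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Evaluate the Gaussian random-vector density of equation (2.63) at x."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    n = x.size
    diff = x - mean
    quad = diff @ np.linalg.solve(cov, diff)   # (x - mu)^T Sigma^{-1} (x - mu)
    norm = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

def covariance_matrix(times, c_x):
    """Build Sigma_X with entries C_X(t_i, t_j) for the given sampling times."""
    return np.array([[c_x(ti, tj) for tj in times] for ti in times])

def c_x(t2, t1):
    """Hypothetical covariance function that depends only on t2 - t1."""
    return np.exp(-abs(t2 - t1))

times = [0.0, 0.5, 1.0]
sigma = covariance_matrix(times, c_x)
value = gaussian_pdf([0.1, -0.2, 0.3], np.zeros(3), sigma)
```

Because `c_x` depends only on the time difference, shifting every entry of `times` by the same amount leaves `sigma` unchanged, and with a constant mean the density is unchanged as well. This mirrors Property 1: for a Gaussian process, wide sense stationarity already fixes all finite-order distributions.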

Example 2.9 

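$X$ and $Y$ are Gaussian, zero mean, and independent. $Z = X + Y$ is also Gaussian.

$$\phi_X(u) = \overline{e^{iuX}} = e^{-\left(\frac{u^2}{2}\sigma_X^2\right)}$$ (2.64)

for all $u \in \mathbb{R}$

$$\phi_Z(u) = \overline{e^{iu(X+Y)}} = e^{-\left(\frac{u^2}{2}\sigma_X^2\right)} e^{-\left(\frac{u^2}{2}\sigma_Y^2\right)} = e^{-\left(\frac{u^2}{2}(\sigma_X^2 + \sigma_Y^2)\right)}$$ (2.65)

therefore $Z$ is also Gaussian.

12 This content is available online at <http://cnx.org/content/m10238/2.7/>.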
2.9 White and Coloured Processes 13

2.9.1 White Noise 

If we have a zero-mean Wide Sense Stationary process $X$, it is a White Noise Process if its ACF is a delta function at $\tau = 0$, i.e. it is of the form:

$$r_{XX}(\tau) = P_X \delta(\tau)$$ (2.66)

where $P_X$ is a constant.

The PSD of $X$ is then given by

$$S_X(\omega) = \int P_X \delta(\tau) e^{-(i\omega\tau)}\,d\tau = P_X e^{-(i\omega 0)} = P_X$$ (2.67)

Hence $X$ is white, since it contains equal power at all frequencies, as in white light.

$P_X$ is the PSD of $X$ at all frequencies.
But:

$$\text{Power of } X = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_X(\omega)\,d\omega = \infty$$ (2.68)

so the White Noise Process is unrealizable in practice, because of its infinite bandwidth.

However, it is very useful as a conceptual entity and as an approximation to 'nearly white' processes which have finite bandwidth, but which are 'white' over all frequencies of practical interest. For 'nearly white' processes, $r_{XX}(\tau)$ is a narrow pulse of non-zero width, and $S_X(\omega)$ is flat from zero up to some relatively high cutoff frequency and then decays to zero above that.

2.9.2 Strict Whiteness and i.i.d. Processes 

Usually the above concept of whiteness is sufficient, but a much stronger definition is as follows: 

Pick a set of times $\{t_1, t_2, \ldots, t_N\}$ to sample $X(t)$.

If, for any choice of $\{t_1, t_2, \ldots, t_N\}$ with $N$ finite, the random variables $X(t_1), X(t_2), \ldots, X(t_N)$ are jointly independent, i.e. their joint pdf is given by

$$f_{X(t_1),X(t_2),\ldots,X(t_N)}(x_1,x_2,\ldots,x_N) = \prod_{i=1}^{N} f_{X(t_i)}(x_i)$$ (2.69)

and the marginal pdfs are identical, i.e.

$$f_{X(t_1)} = f_{X(t_2)} = \cdots = f_{X(t_N)} = f_X$$ (2.70)

then the process is termed Independent and Identically Distributed (i.i.d.).

If, in addition, $f_X$ is a pdf with zero mean, we have a Strictly White Noise Process.
An i.i.d. process is 'white' because the variables $X(t_i)$ and $X(t_j)$ are jointly independent, even when separated by an infinitesimally small interval between $t_i$ and $t_j$.

13 This content is available online at <http://cnx.org/content/m11105/2.4/>.
2.9.3 Additive White Gaussian Noise (AWGN)

In many systems the concept of Additive White Gaussian Noise (AWGN) is used. This simply means 
a process which has a Gaussian pdf, a white PSD, and is linearly added to whatever signal we are analysing. 

Note that although 'white' and 'Gaussian' often go together, this is not necessary (especially for 'nearly white' processes).

E.g. a very high speed random bit stream has an ACF which is approximately a delta function, and hence is a nearly white process, but its pdf is clearly not Gaussian: it is a pair of delta functions at $+V$ and $-V$, the two voltage levels of the bit stream.

Conversely a nearly white Gaussian process which has been passed through a lowpass filter (see next 
section) will still have a Gaussian pdf (as it is a summation of Gaussians) but will no longer be white. 

2.9.4 Coloured Processes 

A random process whose PSD is not white or nearly white is often known as a coloured noise process.

We may obtain coloured noise $Y(t)$ with PSD $S_Y(\omega)$ simply by passing white (or nearly white) noise $X(t)$ with PSD $P_X$ through a filter with frequency response $H(\omega)$, such that, from this equation 14 from our discussion of Spectral Properties of Random Signals,

$$S_Y(\omega) = S_X(\omega)\,|H(\omega)|^2 = P_X\,|H(\omega)|^2$$ (2.71)

Hence if we design the filter such that

$$|H(\omega)| = \sqrt{\frac{S_Y(\omega)}{P_X}}$$ (2.72)

then $Y(t)$ will have the required coloured PSD.

For this to work, $S_X(\omega)$ need only be constant (white) over the passband of the filter, so a nearly white process which satisfies this criterion is quite satisfactory and realizable.

Using this equation 15 from our discussion of Spectral Properties of Random Signals and (2.66), the ACF of the coloured noise is given by

$$r_{YY}(\tau) = r_{XX}(\tau) * h(-\tau) * h(\tau) = P_X \delta(\tau) * h(-\tau) * h(\tau) = P_X\, h(-\tau) * h(\tau)$$ (2.73)

where $h(\tau)$ is the impulse response of the filter.

This Figure 16 from previous discussion shows two examples of coloured noise, although the upper waveform is more 'nearly white' than the lower one, as can be seen in part c of this figure 17 from previous discussion in which the upper PSD is flatter than the lower PSD. In these cases, the coloured waveforms were produced by passing uncorrelated random noise samples (white up to half the sampling frequency) through half-sine filters (as in this equation 18 from our discussion of Random Signals) of length $T_b$ = 10 and 50 samples respectively.
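
A discrete-time sketch of this construction is given below: uncorrelated Gaussian samples are passed through a half-sine filter and the sample ACF of the output is compared with the prediction of (2.73). The exact sampled shape of the half-sine in `half_sine`, the seed, and the sample count are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
p_x = 1.0                  # variance of the white samples (plays the role of P_X)
n_samples = 200_000
x = rng.normal(0.0, np.sqrt(p_x), n_samples)   # uncorrelated (white) samples

def half_sine(length):
    """Half-sine impulse response sampled at `length` points (assumed shape)."""
    n = np.arange(length)
    return np.sin(np.pi * (n + 0.5) / length)

def acf(y, max_lag):
    """Sample autocorrelation of y for lags 0..max_lag."""
    return np.array([np.mean(y[:len(y) - k] * y[k:]) for k in range(max_lag + 1)])

h = half_sine(10)
y = np.convolve(x, h, mode="valid")            # coloured noise

# Discrete counterpart of P_X * (h(-tau) * h(tau)), kept for lags 0..len(h)-1
predicted = p_x * np.correlate(h, h, mode="full")[len(h) - 1:]
estimated = acf(y, len(h) - 1)
```

The arrays `estimated` and `predicted` should agree up to sampling noise; using `half_sine(50)` instead gives a wider ACF, corresponding to a narrower, more strongly coloured PSD.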



14 "Spectral Properties of Random Signals", (11) <http://cnx.org/content/m11104/latest/#eq27>
15 "Spectral Properties of Random Signals", (9) <http://cnx.org/content/m11104/latest/#eq25>
16 "Spectral Properties of Random Signals", Figure 1 <http://cnx.org/content/m11104/latest/#figure1>
17 "Spectral Properties of Random Signals", Figure 1(c) <http://cnx.org/content/m11104/latest/#figure1c>
18 "Random Signals", (1) <http://cnx.org/content/m10989/latest/#eq9>
2.10 Transmission of Stationary Process Through a Linear Filter 19

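Integration

$$Z(\omega) = \int_a^b X_t(\omega)\,dt$$ (2.74)

Linear Processing

$$Y_t = \int_{-\infty}^{\infty} h(t,\tau) X_\tau\,d\tau$$ (2.75)

Differentiation

$$X_t' = \frac{d}{dt}(X_t)$$ (2.76)

Properties

1. $\overline{Z} = \overline{\int_a^b X_t(\omega)\,dt} = \int_a^b \mu_X(t)\,dt$

2. $\overline{Z^2} = \overline{\int_a^b X_{t_2}\,dt_2 \int_a^b X_{t_1}^*\,dt_1} = \int_a^b \int_a^b R_X(t_2,t_1)\,dt_1\,dt_2$

Figure 2.21: $X_t$ → Linear System $h(t,\tau)$ → $Y_t$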



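$$\mu_Y(t) = \overline{\int_{-\infty}^{\infty} h(t,\tau) X_\tau\,d\tau} = \int_{-\infty}^{\infty} h(t,\tau)\,\mu_X(\tau)\,d\tau$$ (2.77)

If $X_t$ is wide sense stationary and the linear system is time invariant

$$\mu_Y(t) = \int_{-\infty}^{\infty} h(t - \tau)\,\mu_X\,d\tau = \mu_Y$$ (2.78)

$$R_{YX}(t_2,t_1) = \overline{Y_{t_2} X_{t_1}^*} = \overline{\int_{-\infty}^{\infty} h(t_2 - \tau) X_\tau\,d\tau\; X_{t_1}^*} = \int_{-\infty}^{\infty} h(t_2 - \tau)\,R_X(\tau - t_1)\,d\tau$$ (2.79)

$$R_{YX}(t_2,t_1) = \int_{-\infty}^{\infty} h(t_2 - t_1 - \tau')\,R_X(\tau')\,d\tau' = h * R_X(t_2 - t_1)$$ (2.80)

where $\tau' = \tau - t_1$.

Assuming a real-valued impulse response,

$$R_Y(t_2,t_1) = \overline{Y_{t_2} Y_{t_1}^*} = \overline{Y_{t_2} \int_{-\infty}^{\infty} h(t_1,\tau) X_\tau^*\,d\tau} = \int_{-\infty}^{\infty} h(t_1,\tau)\,R_{YX}(t_2,\tau)\,d\tau = \int_{-\infty}^{\infty} h(t_1 - \tau)\,R_{YX}(t_2 - \tau)\,d\tau$$ (2.81)

$$R_Y(t_2,t_1) = \int_{-\infty}^{\infty} h(\tau' - (t_2 - t_1))\,R_{YX}(\tau')\,d\tau' = R_Y(t_2 - t_1) = \tilde{h} * R_{YX}(t_2 - t_1)$$ (2.82)

where $\tau' = t_2 - \tau$ and $\tilde{h}(\tau) = h(-\tau)$ for all $\tau \in \mathbb{R}$. $Y_t$ is WSS if $X_t$ is WSS and the linear system is time-invariant.

19 This content is available online at <http://cnx.org/content/m10237/2.10/>.

Figure 2.22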



Example 2.10 

$X_t$ is a wide sense stationary process with $\mu_X = 0$, and $R_X(\tau) = \frac{N_0}{2}\delta(\tau)$. Consider the random process going through a filter with impulse response $h(t) = e^{-(at)} u(t)$. The output process is denoted by $Y_t$. $\mu_Y(t) = 0$ for all $t$.

$$R_Y(\tau) = \frac{N_0}{2} \int_{-\infty}^{\infty} h(\alpha)\,h(\alpha - \tau)\,d\alpha = \frac{N_0}{2} \frac{e^{-(a|\tau|)}}{2a}$$ (2.83)

$X_t$ is called a white process. $Y_t$ is a Markov process.
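
The closed form in (2.83) can be checked by evaluating the integral numerically. The sketch below uses a plain Riemann sum; the values of `a` and `n0`, the truncation point `t_max` and the step `dt` are arbitrary illustrative choices.

```python
import numpy as np

a = 2.0    # hypothetical filter decay rate
n0 = 1.0   # hypothetical noise level N_0

def h(t):
    """Impulse response h(t) = exp(-a t) u(t)."""
    return np.where(t >= 0.0, np.exp(-a * t), 0.0)

def r_y_numeric(tau, t_max=20.0, dt=1e-4):
    """Riemann-sum approximation of (N_0/2) * integral of h(alpha) h(alpha - tau)."""
    alpha = np.arange(0.0, t_max, dt)   # h(alpha) vanishes for alpha < 0
    return (n0 / 2.0) * np.sum(h(alpha) * h(alpha - tau)) * dt

def r_y_closed_form(tau):
    """Closed form (N_0/2) exp(-a |tau|) / (2 a) from equation (2.83)."""
    return (n0 / 2.0) * np.exp(-a * abs(tau)) / (2.0 * a)
```

Calling `r_y_numeric(0.5)` and `r_y_numeric(-0.5)` should give nearly the same value, reflecting the even symmetry of the autocorrelation of a real WSS process.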

Definition 2.14: Power Spectral Density 

The power spectral density function of a wide sense stationary (WSS) process $X_t$ is defined to be the Fourier transform of the autocorrelation function of $X_t$.

$$S_X(f) = \int_{-\infty}^{\infty} R_X(\tau)\,e^{-(i 2\pi f \tau)}\,d\tau$$ (2.84)

if $X_t$ is WSS with autocorrelation function $R_X(\tau)$.
Properties 
1. $S_X(f) = S_X(-f)$ since $R_X$ is even and real.

2. $\mathrm{Var}(X_t) = R_X(0) = \int_{-\infty}^{\infty} S_X(f)\,df$ (for a zero-mean process)

3. $S_X(f)$ is real and nonnegative: $S_X(f) \ge 0$ for all $f$.

If $Y_t = \int_{-\infty}^{\infty} h(t - \tau) X_\tau\,d\tau$ then

$$S_Y(f) = \mathcal{F}(R_Y(\tau)) = \mathcal{F}(h * \tilde{h} * R_X(\tau)) = H(f)\,\tilde{H}(f)\,S_X(f) = |H(f)|^2 S_X(f)$$ (2.85)

since $\tilde{H}(f) = \int_{-\infty}^{\infty} \tilde{h}(t)\,e^{-(i 2\pi f t)}\,dt = H^*(f)$

Example 2.11

$X_t$ is a white process and $h(t) = e^{-(at)} u(t)$.

$$H(f) = \frac{1}{a + i 2\pi f}$$ (2.86)

$$S_Y(f) = \frac{N_0/2}{a^2 + 4\pi^2 f^2}$$ (2.87)



Chapter 3 

Chapter 2: Source Coding 



3.1 Information Theory and Coding 1 

In the previous chapters, we considered the problem of digital transmission over different channels. Information sources are often not digital; in fact, many sources are analog. Although many channels are also analog, it is still more efficient to convert analog sources into digital data and transmit over analog channels using digital transmission techniques. There are two reasons why digital transmission could be more efficient and more reliable than analog transmission:

1. Analog sources could be compressed to digital form efficiently. 

2. Digital data can be transmitted over noisy channels reliably. 

There are several key questions that need to be addressed: 

1. How can one model information? 

2. How can one quantify information? 

3. If information can be measured, does its information quantity relate to how much it can be compressed? 

4. Is it possible to determine if a particular channel can handle transmission of a source with a particular 
information quantity? 



Figure 3.1: Source → Channel → Destination



Example 3.1 

The information contents of the following sentences are not the same: "Hello, hello, hello." and "There is an exam today." Clearly the second one carries more information. The first one can be compressed to "Hello" without much loss of information.

In other modules, we will quantify information and find efficient representation of information (Entropy (Section 3.2)). We will also quantify how much (Section 6.1) information can be transmitted through channels, reliably. Channel coding (Section 6.5) can be used to reduce information rate and increase reliability.

1 This content is available online at <http://cnx.org/content/m10162/2.10/>.
3.2 Entropy 2

Information sources take very different forms. Since the information is not known to the destination, it is 
then best modeled as a random process, discrete-time or continuous time. 
Here are a few examples: 

• Digital data source (e.g., a text) can be modeled as a discrete-time and discrete-valued random process $X_1, X_2, \ldots$, where $X_i \in \{A, B, C, D, E, \ldots\}$ with a particular $p_{X_1}(x), p_{X_2}(x), \ldots$, and a specific $p_{X_1 X_2}, p_{X_2 X_3}, \ldots$, and $p_{X_1 X_2 X_3}, p_{X_2 X_3 X_4}, \ldots$, etc.

• Video signals can be modeled as a continuous time random process. The power spectral density is 
bandlimited to around 5 MHz (the value depends on the standards used to raster the frames of image). 

• Audio signals can be modeled as a continuous-time random process. It has been demonstrated that 
the power spectral density of speech signals is bandlimited between 300 Hz and 3400 Hz. For example, 
the speech signal can be modeled as a Gaussian process with the shown (Figure 3.2) power spectral 
density over a small observation period. 



Figure 3.2: Power spectral density $S_X(f)$ of a speech signal, with the frequency axis marked at 300 and 3400.



These analog information signals are bandlimited. Therefore, if sampled faster than the Nyquist rate, 
they can be reconstructed from their sample values. 

Example 3.2 

A speech signal with bandwidth of 3100 Hz can be sampled at the rate of 6.2 kHz. If the samples are quantized with an 8-level quantizer then the speech signal can be represented with a binary sequence with the rate of

$$6.2 \times 10^3 \log_2 8 = 18600\ \frac{\text{bits}}{\text{sec}} = 18.6\ \frac{\text{kbits}}{\text{sec}}$$ (3.1)

Figure 3.3: Speech signal and its representation as a binary sequence.

2 This content is available online at <http://cnx.org/content/m10164/2.16/>.



The sampled real values can be quantized to create a discrete-time discrete-valued random 
process. Since any bandlimited analog information signal can be converted to a sequence of discrete 
random variables, we will continue the discussion only for discrete random variables. 

Example 3.3 

The random variable $x$ takes the value of 0 with probability 0.9 and the value of 1 with probability 0.1. The statement that $x = 1$ carries more information than the statement that $x = 0$. The reason is that $x$ is expected to be 0; therefore, knowing that $x = 1$ is more surprising news. An intuitive definition of information measure should be larger when the probability is small.

Example 3.4 

The information content in the statement about the temperature and pollution level on July 15th in Chicago should be the sum of the information in the statement that July 15th in Chicago was hot and the information in the statement that it was highly polluted, provided pollution and temperature are independent.

$$I(\text{hot}, \text{high}) = I(\text{hot}) + I(\text{high})$$ (3.2)



An intuitive and meaningful measure of information should have the following properties: 

1. Self information should decrease with increasing probability. 

2. Self information of two independent events should be their sum. 

3. Self information should be a continuous function of the probability. 

The only function satisfying the above conditions is the -log of the probability. 
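
The three properties are easy to check numerically for the negative logarithm. The helper below is a minimal sketch; the function name `self_information` and the default base 2 are choices made here for illustration.

```python
import math

def self_information(p, base=2.0):
    """Self information -log(p) of an event with probability p (bits for base 2)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log(p) / math.log(base)

rare = self_information(0.1)      # the surprising event of Example 3.3
common = self_information(0.9)    # the expected event of Example 3.3

# Independent events with probabilities 0.5 and 0.25 occur jointly with probability 0.125
joint = self_information(0.5 * 0.25)
separate = self_information(0.5) + self_information(0.25)
```

Here `rare` exceeds `common`, and `joint` equals `separate`, which are the first two properties in the list above.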

Definition 3.1: Entropy 

1. The entropy (average self information) of a discrete random variable $X$ is a function of its probability mass function and is defined as

$$H(X) = -\sum_{i=1}^{N} p_X(x_i) \log p_X(x_i)$$ (3.3)

where $N$ is the number of possible values of $X$ and $p_X(x_i) = \Pr[X = x_i]$. If log is base 2 then the unit of entropy is bits. Entropy is a measure of uncertainty in a random variable and a measure of information it can reveal.
2. A more basic explanation of entropy is provided in another module 3.

Example 3.5 

Suppose a source produces binary information $\{0, 1\}$ with probabilities $p$ and $1 - p$. The entropy of the source is

$$H(X) = -p \log_2 p - (1 - p) \log_2 (1 - p)$$ (3.4)

If $p = 0$ then $H(X) = 0$, if $p = 1$ then $H(X) = 0$, if $p = 1/2$ then $H(X) = 1$ bit. The source has its largest entropy if $p = 1/2$ and the source provides no new information if $p = 0$ or $p = 1$.

3 "Entropy" <http://cnx.org/content/m0070/latest/>

Figure 3.4
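
The binary entropy of Example 3.5 is simple to evaluate. The sketch below treats $0 \log_2 0$ as 0, which is the usual convention and is assumed here rather than stated in the text.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a binary source with probabilities p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0            # convention: 0 * log2(0) is taken as 0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

# Evaluate on a grid to locate the maximum
grid = [k / 100.0 for k in range(101)]
values = [binary_entropy(p) for p in grid]
p_at_max = grid[values.index(max(values))]
```

On this grid the maximum is attained at `p_at_max = 0.5`, and the function is symmetric about that point.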



Example 3.6 

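An analog source is modeled as a continuous-time random process with power spectral density bandlimited to the band between 0 and 4000 Hz. The signal is sampled at the Nyquist rate. The sequence of random variables, as a result of sampling, are assumed to be independent. The samples are quantized to 5 levels $\{-2, -1, 0, 1, 2\}$. The probabilities of the samples taking the quantized values are $\{\frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{16}, \frac{1}{16}\}$, respectively. The entropy of the random variables is

$$\begin{aligned} H(X) &= -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{4}\log_2\tfrac{1}{4} - \tfrac{1}{8}\log_2\tfrac{1}{8} - \tfrac{1}{16}\log_2\tfrac{1}{16} - \tfrac{1}{16}\log_2\tfrac{1}{16} \\ &= \tfrac{1}{2}\log_2 2 + \tfrac{1}{4}\log_2 4 + \tfrac{1}{8}\log_2 8 + \tfrac{1}{16}\log_2 16 + \tfrac{1}{16}\log_2 16 \\ &= \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{3}{8} + \tfrac{1}{2} \\ &= \tfrac{15}{8}\ \frac{\text{bits}}{\text{sample}} \end{aligned}$$ (3.5)

There are 8000 samples per second. Therefore, the source produces $8000 \times \frac{15}{8} = 15000\ \frac{\text{bits}}{\text{sec}}$ of information.

Definition 3.2: Joint Entropy

The joint entropy of two discrete random variables $(X, Y)$ is defined by

$$H(X,Y) = -\sum_i \sum_j p_{X,Y}(x_i,y_j) \log p_{X,Y}(x_i,y_j)$$ (3.6)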
The joint entropy for a random vector $X = (X_1, X_2, \ldots, X_n)^T$ is defined as

$$H(X) = -\sum_{x_1} \sum_{x_2} \cdots \sum_{x_n} p_X(x_1,x_2,\ldots,x_n) \log p_X(x_1,x_2,\ldots,x_n)$$ (3.7)

Definition 3.3: Conditional Entropy

The conditional entropy of the random variable $X$ given the random variable $Y$ is defined by

$$H(X|Y) = -\sum_i \sum_j p_{X,Y}(x_i,y_j) \log p_{X|Y}(x_i|y_j)$$ (3.8)
It is easy to show that

$$H(X) = H(X_1) + H(X_2|X_1) + \cdots + H(X_n|X_1 X_2 \ldots X_{n-1})$$ (3.9)

and

$$H(X,Y) = H(Y) + H(X|Y) = H(X) + H(Y|X)$$ (3.10)

If $X_1, X_2, \ldots, X_n$ are mutually independent it is easy to show that

$$H(X) = \sum_{i=1}^{n} H(X_i)$$ (3.11)
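
The chain rule (3.10) and the additivity for independent variables (3.11) can be verified numerically on a small joint pmf. In the sketch below the joint pmf `p_xy` is a made-up example, and entropies are measured in bits.

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a pmf given as an array of any shape (zeros are skipped)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def conditional_entropy_x_given_y(p_xy):
    """H(X|Y) from definition (3.8), with p_xy[i, j] = Pr[X = x_i, Y = y_j]."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_y = p_xy.sum(axis=0)
    total = 0.0
    for i in range(p_xy.shape[0]):
        for j in range(p_xy.shape[1]):
            if p_xy[i, j] > 0:
                # p_{X|Y}(x_i | y_j) = p_{X,Y}(x_i, y_j) / p_Y(y_j)
                total -= p_xy[i, j] * np.log2(p_xy[i, j] / p_y[j])
    return float(total)

# Hypothetical joint pmf (illustrative values only)
p_xy = np.array([[0.25, 0.25],
                 [0.50, 0.00]])

h_joint = entropy(p_xy)
h_y = entropy(p_xy.sum(axis=0))
h_x_given_y = conditional_entropy_x_given_y(p_xy)
```

For this pmf `h_joint` equals `h_y + h_x_given_y`, as (3.10) requires. Replacing `p_xy` with the outer product of two marginals gives a joint entropy equal to the sum of the individual entropies, as in (3.11).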



Definition 3.4: Entropy Rate 

The entropy rate of a stationary discrete-time random process is defined by 



$$H = \lim_{n \to \infty} H(X_n | X_1 X_2 \ldots X_{n-1})$$ (3.12)

The limit exists and is equal to

$$H = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \ldots, X_n)$$ (3.13)



The entropy rate is a measure of the uncertainty of information content per output symbol of the 
source. 

Entropy is closely tied to source coding (Section 3.3). The extent to which a source can be compressed 
is related to its entropy. In 1948, Claude E. Shannon introduced a theorem which related the entropy to the 
number of bits per second required to represent a source without much loss. 



3.3 Source Coding 4 

As mentioned earlier, how much a source can be compressed should be related to its entropy (Section 3.2). 
In 1948, Claude E. Shannon introduced three theorems and developed very rigorous mathematics for digital 
communications. In one of the three theorems, Shannon relates entropy to the minimum number of bits per 
second required to represent a source without much loss (or distortion). 

Consider a source that is modeled by a discrete-time and discrete-valued random process $X_1, X_2, \ldots, X_n, \ldots$ where $X_i \in \{a_1, a_2, \ldots, a_N\}$ and define $p_{X_i}(x_i = a_j) = p_j$ for $j = 1, 2, \ldots, N$, where it is assumed that $X_1, X_2, \ldots, X_n$ are mutually independent and identically distributed.

Consider a sequence of length $n$

$$X = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{pmatrix}$$ (3.14)

The symbol $a_1$ can occur with probability $p_1$. Therefore, in a sequence of length $n$, on the average, $a_1$ will appear $n p_1$ times with high probability if $n$ is very large.
Therefore,

$$P(X = x) = p_{X_1}(x_1)\,p_{X_2}(x_2) \cdots p_{X_n}(x_n)$$ (3.15)

4 This content is available online at <http://cnx.org/content/m10175/2.10/>.
$$P(X = x) \approx p_1^{n p_1}\,p_2^{n p_2} \cdots p_N^{n p_N} = \prod_{i=1}^{N} p_i^{n p_i}$$ (3.16)

where $p_i = P(X_j = a_i)$ for all $j$ and for all $i$.
A typical sequence $X$ may look like

$$X = (a_2, \ldots, a_N, a_2, a_5, \ldots, a_1, \ldots, a_N, a_6)^T$$ (3.17)

where $a_i$ appears $n p_i$ times with large probability. This is referred to as a typical sequence. The probability of $X$ being a typical sequence is

$$P(X = x) \approx \prod_{i=1}^{N} p_i^{n p_i} = \prod_{i=1}^{N} 2^{n p_i \log_2 p_i} = 2^{n \sum_{i=1}^{N} p_i \log_2 p_i} = 2^{-(nH(X))}$$ (3.18)

where $H(X)$ is the entropy of the random variables $X_1, X_2, \ldots, X_n$.

For large $n$, almost all the output sequences of length $n$ of the source are equally probable with probability $\approx 2^{-(nH(X))}$. These are typical sequences. The probability of nontypical sequences is negligible. There are $N^n$ different sequences of length $n$ with alphabet of size $N$. The probability of typical sequences is almost 1.

$$\sum_{k=1}^{\#\text{ of typical seq.}} 2^{-(nH(X))} = 1$$ (3.19)

Figure 3.5



Example 3.7 

Consider a source with alphabet $\{A, B, C, D\}$ with probabilities $\{\frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{8}\}$. Assume $X_1, X_2, \ldots, X_8$ is an independent and identically distributed sequence with $X_i \in \{A, B, C, D\}$ with the above probabilities.

$$H(X) = -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{4}\log_2\tfrac{1}{4} - \tfrac{1}{8}\log_2\tfrac{1}{8} - \tfrac{1}{8}\log_2\tfrac{1}{8} = \tfrac{1}{2} + \tfrac{2}{4} + \tfrac{3}{8} + \tfrac{3}{8} = \tfrac{4 + 4 + 6}{8} = \tfrac{14}{8}$$ (3.20)

The number of typical sequences of length 8

$$2^{8 \times \frac{14}{8}} = 2^{14}$$ (3.21)

The number of nontypical sequences $4^8 - 2^{14} = 2^{16} - 2^{14} = 2^{14}(4 - 1) = 3 \times 2^{14}$

Examples of typical sequences include those with A appearing $8 \times \frac{1}{2} = 4$ times, B appearing $8 \times \frac{1}{4} = 2$ times, etc.: {A,D,B,B,A,A,C,A}, {A,A,A,A,C,D,B,B} and many more.

Examples of nontypical sequences of length 8: {D,D,B,C,C,A,B,D}, {C,C,C,C,C,B,C,C} and many more. Indeed, these definitions and arguments are valid when $n$ is very large. The probability of a source output being in the set of typical sequences approaches 1 as $n \to \infty$. The probability of a source output being in the set of nontypical sequences approaches 0 as $n \to \infty$.
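
The numbers in Example 3.7 can be reproduced with exact rational arithmetic. The sketch below is only an illustration of the counting argument; the helper names are made up here.

```python
from fractions import Fraction
from math import factorial, log2

probs = {"A": Fraction(1, 2), "B": Fraction(1, 4),
         "C": Fraction(1, 8), "D": Fraction(1, 8)}
n = 8

# Entropy in bits per symbol
entropy = -sum(float(p) * log2(float(p)) for p in probs.values())

approx_typical = 2 ** round(n * entropy)      # 2^(nH), the count used in the text
total = len(probs) ** n                       # all sequences of length n
approx_nontypical = total - approx_typical

def sequence_probability(seq):
    """Exact probability of an i.i.d. sequence given as a string of symbols."""
    result = Fraction(1)
    for symbol in seq:
        result *= probs[symbol]
    return result

# Number of sequences whose symbol counts are exactly n * p_i (multinomial coefficient)
counts = [int(n * p) for p in probs.values()]
exact_composition = factorial(n)
for c in counts:
    exact_composition //= factorial(c)
```

Each sequence with the expected composition, such as `"ADBBAACA"`, has probability exactly $2^{-14}$, matching (3.18). For $n = 8$ the number of sequences with exactly that composition (`exact_composition`) is much smaller than $2^{14}$; the count $2^{nH(X)}$ is an asymptotic statement, which is consistent with the remark that the argument holds when $n$ is very large.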

The essence of source coding or data compression is that as $n \to \infty$, nontypical sequences never appear as the output of the source. Therefore, one only needs to be able to represent typical sequences as binary codes and ignore nontypical sequences. Since there are only $2^{nH(X)}$ typical sequences of length $n$, it takes $nH(X)$ bits to represent them on the average. On the average it takes $H(X)$ bits per source output to represent a simple source that produces independent and identically distributed outputs.

Theorem 3.1: Shannon's Source-Coding 

A source that produces independent and identically distributed random variables with entropy H
can be encoded with arbitrarily small error probability at any rate R in bits per source output if 
R > H. Conversely, if R < H, the error probability will be bounded away from zero, independent 
of the complexity of coder and decoder. 
The source coding theorem proves existence of source coding techniques that achieve rates close to the entropy but does not provide any algorithms or ways to construct such codes.

If the source is not i.i.d. (independent and identically distributed), but it is stationary with memory, then a similar theorem applies with the entropy $H(X)$ replaced with the entropy rate $H = \lim_{n \to \infty} H(X_n | X_1 X_2 \ldots X_{n-1})$.

In the case of a source with memory, the more the source produces outputs the more one knows about 
the source and the more one can compress. 

Example 3.8 

The English language has 26 letters; with space it becomes an alphabet of size 27. If modeled as a memoryless source (no dependency between letters in a word) then the entropy is $H(X) = 4.03$ bits/letter.

If the dependency between letters in a text is captured in a model, the entropy rate can be derived to be $H = 1.3$ bits/letter. Note that a non-information theoretic representation of a text may require 5 bits/letter since $2^5 = 32$ is the smallest power of 2 that is at least 27. Shannon's results indicate that there may be a compression algorithm with the rate of 1.3 bits/letter.

Although Shannon's results are not constructive, there are a number of source coding algorithms for discrete 
time discrete valued sources that come close to Shannon's bound. One such algorithm is the Huffman source 
coding algorithm (Section 3.4). Another is the Lempel and Ziv algorithm. 

Huffman codes and Lempel and Ziv apply to compression problems where the source produces discrete time and discrete valued outputs. For cases where the source is analog there are powerful compression algorithms that specify all the steps from sampling and quantization to binary representation. These are referred to as waveform coders. JPEG, MPEG, and vocoders are a few examples for image, video, and voice, respectively.

3.4 Huffman Coding 5 

One particular source coding (Section 3.3) algorithm is the Huffman encoding algorithm. It is a source 
coding algorithm which approaches, and sometimes achieves, Shannon's bound for source compression. A 
brief discussion of the algorithm is also given in another module 6 . 

3.4.1 Huffman encoding algorithm 

1. Sort source outputs in decreasing order of their probabilities 

2. Merge the two least-probable outputs into a single output whose probability is the sum of the corre- 
sponding probabilities. 

3. If the number of remaining outputs is more than 2, then go to step 1. 

4. Arbitrarily assign 0 and 1 as codewords for the two remaining outputs.

5. If an output is the result of the merger of two outputs in a preceding step, append a 0 and a 1 to
the current codeword to obtain the codewords for the two preceding outputs, and repeat step 5. If no
output is the result of a merger in a preceding step, then stop.

Example 3.9 

$X \in \{A, B, C, D\}$ with probabilities $\left\{\frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{8}\right\}$



5 This content is available online at <http://cnx.org/content/m10176/2.10/>.

6 "Compression and the Huffman Code" <http://cnx.org/content/m0092/latest/> 
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Figure 3.6 



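Average length = $\frac{1}{2} \cdot 1 + \frac{1}{4} \cdot 2 + \frac{1}{8} \cdot 3 + \frac{1}{8} \cdot 3 = \frac{14}{8}$.
As you may recall, the entropy of the source was also $H(X) = \frac{14}{8}$. In this case, the Huffman
code achieves the lower bound of $\frac{14}{8}$ bits/output.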
In general, we can define average code length as 



$$\bar{\ell} = \sum_{x \in \mathcal{X}} p_X(x)\, \ell(x) \quad (3.22)$$

where $\mathcal{X}$ is the set of possible values of x.
It is not very hard to show that

$$H(X) \le \bar{\ell} < H(X) + 1 \quad (3.23)$$

For compressing a single source output at a time, Huffman codes provide nearly optimum code lengths.
The drawbacks of Huffman coding

1. Codes are variable length.

2. The algorithm requires the knowledge of the probabilities, $p_X(x)$ for all $x \in \mathcal{X}$.

Another powerful source coder that does not have the above shortcomings is Lempel and Ziv. 
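
The merging procedure of Section 3.4.1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: the function name `huffman_code` is hypothetical, and the probabilities are assumed to be the dyadic set 1/2, 1/4, 1/8, 1/8, which is consistent with the average length quoted in Example 3.9. A priority queue replaces the explicit re-sorting of step 1.

```python
import heapq
import math

def huffman_code(probs):
    """Return {symbol: codeword} for a dict of symbol probabilities."""
    # Heap entries: (probability, unique tie-breaker, {symbol: partial codeword})
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    if len(heap) == 1:
        return {sym: "0" for sym in probs}
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # least probable output
        p1, _, c1 = heapq.heappop(heap)   # second least probable output
        # Merge: prefix one branch with 0 and the other with 1
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

# Assumed probabilities (dyadic, as in Example 3.9)
probs = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
code = huffman_code(probs)
avg_len = sum(probs[s] * len(code[s]) for s in probs)
entropy = -sum(p * math.log2(p) for p in probs.values())
print(code, avg_len, entropy)
```

For dyadic probabilities like these, the average codeword length and the entropy coincide, which is the situation described in Example 3.9.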
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Chapter 4 

Chapter 3: Communication over AWGN 
Channels 

4.1 Data Transmission and Reception 1 

We will develop the idea of data transmission by first considering simple channels. In additional modules, 
we will consider more practical channels; baseband channels with bandwidth constraints and passband 
channels. 

Simple additive white Gaussian channels 



Figure 4.1: $X_t$ carries data, $N_t$ is a white Gaussian random process. (Block diagram: input,
channel, output.)



The concept of using different types of modulation for transmission of data is introduced in the module 
Signalling (Section 4.2). The problem of demodulation and detection of signals is discussed in Demodulation 
and Detection (Section 4.4). 

4.2 Signalling 2 

Example 4.1 

Data symbols are "1" or "0" and data rate is $\frac{1}{T}$ Hertz.



1 This content is available online at <http://cnx.org/content/m10115/2.9/>.
2 This content is available online at <http://cnx.org/content/m10116/2.11/>.



Pulse amplitude modulation (PAM) 



Figure 4.2 (data sequence and the corresponding modulated signal $X_t$)



Pulse position modulation 



Figure 4.3 



Example 4.2

Data symbols are "1" or "0" and the data rate is $\frac{1}{T}$ Hertz.
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Figure 4.4 



This strategy is an alternative to PAM with half the period, $\frac{T}{2}$.
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Figure 4.5 



Relevant measures are energy of modulated signals

$$\forall m \in \{1, 2, \ldots, M\}: E_m = \int_0^T s_m^2(t)\, dt \quad (4.1)$$

and how different they are in terms of inner products.

$$\langle s_m, s_n \rangle = \int_0^T s_m(t)\, s_n(t)\, dt \quad (4.2)$$

for $m \in \{1, 2, \ldots, M\}$ and $n \in \{1, 2, \ldots, M\}$.
Definition 4.1: antipodal

Signals $s_1(t)$ and $s_2(t)$ are antipodal if $\forall t \in [0, T]: s_2(t) = -s_1(t)$.
Definition 4.2: orthogonal

Signals $s_1(t), s_2(t), \ldots, s_M(t)$ are orthogonal if $\langle s_m, s_n \rangle = 0$ for $m \ne n$.
Definition 4.3: biorthogonal

Signals $s_1(t), s_2(t), \ldots, s_M(t)$ are biorthogonal if $s_1(t), \ldots, s_{M/2}(t)$ are orthogonal and
$s_m(t) = -s_{M/2+m}(t)$ for each $m \in \left\{1, 2, \ldots, \frac{M}{2}\right\}$.
It is quite intuitive to expect that the smaller (the more negative) the inner products
$\langle s_m, s_n \rangle$ for all $m \ne n$, the better the signal set.

Definition 4.4: Simplex signals 

Let $\{s_1(t), s_2(t), \ldots, s_M(t)\}$ be a set of orthogonal signals with equal energy. The signals
$\tilde{s}_1(t), \ldots, \tilde{s}_M(t)$ are simplex signals if

$$\tilde{s}_m(t) = s_m(t) - \frac{1}{M} \sum_{k=1}^{M} s_k(t) \quad (4.3)$$

If the energy of orthogonal signals is denoted by

$$\forall m \in \{1, 2, \ldots, M\}: E_s = \int_0^T s_m^2(t)\, dt \quad (4.4)$$

then the energy of simplex signals

$$E_{\tilde{s}} = \left(1 - \frac{1}{M}\right) E_s \quad (4.5)$$

and

$$\forall m \ne n: \langle \tilde{s}_m, \tilde{s}_n \rangle = -\frac{1}{M-1} E_{\tilde{s}} \quad (4.6)$$

It is conjectured that among all possible M-ary signals with equal energy, the simplex signal set results 
in the smallest probability of error when used to transmit information through an additive white Gaussian 
noise channel. 
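
The energy and inner-product relations for simplex signals can be checked numerically. The sketch below is an assumption-laden illustration: orthogonal signals are represented by the rows of a scaled identity matrix (their coordinate vectors), and the values of M and E_s are arbitrary.

```python
import numpy as np

M, Es = 4, 2.0
S = np.sqrt(Es) * np.eye(M)        # rows: orthogonal signals, each with energy Es
S_simplex = S - S.mean(axis=0)     # subtract (1/M) * sum_k s_k from every signal
G = S_simplex @ S_simplex.T        # Gram matrix of all inner products
E_simplex = (1 - 1 / M) * Es       # predicted energy of each simplex signal

print(np.diag(G))                  # energies of the simplex signals
print(G[0, 1], -E_simplex / (M - 1))
```

The diagonal of the Gram matrix gives the simplex energies and every off-diagonal entry gives the common negative inner product.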

The geometric representation of signals (Section 4.3) can provide a compact description of signals and 
can simplify performance analysis of communication systems using the signals. 

Once signals have been modulated, the receiver must detect and demodulate (Section 4.4) the signals 
despite interference and noise and decide which of the set of possible transmitted signals was sent. 

4.3 Geometric Representation of Modulation Signals 3 

Geometric representation of signals can provide a compact characterization of signals and can simplify 
analysis of their performance as modulation signals. 

Orthonormal bases are essential in geometry. Let $\{s_1(t), s_2(t), \ldots, s_M(t)\}$ be a set of signals.

Define $\psi_1(t) = \frac{s_1(t)}{\sqrt{E_1}}$ where $E_1 = \int_0^T s_1^2(t)\, dt$.

Define $s_{21} = \langle s_2, \psi_1 \rangle = \int_0^T s_2(t)\, \psi_1(t)\, dt$ and
$\psi_2(t) = \frac{1}{\sqrt{\hat{E}_2}} \left(s_2(t) - s_{21} \psi_1(t)\right)$ where
$\hat{E}_2 = \int_0^T \left(s_2(t) - s_{21} \psi_1(t)\right)^2 dt$.

In general

$$\psi_k(t) = \frac{1}{\sqrt{\hat{E}_k}} \left(s_k(t) - \sum_{j=1}^{k-1} s_{kj}\, \psi_j(t)\right) \quad (4.7)$$

where $\hat{E}_k = \int_0^T \left(s_k(t) - \sum_{j=1}^{k-1} s_{kj}\, \psi_j(t)\right)^2 dt$.

The process continues until all of the M signals are exhausted. The results are N orthogonal signals
with unit energy, $\{\psi_1(t), \psi_2(t), \ldots, \psi_N(t)\}$ where $N \le M$. If the signals
$\{s_1(t), \ldots, s_M(t)\}$ are linearly independent, then $N = M$.



3 This content is available online at <http://cnx.org/content/m10035/2.13/>.






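The M signals can be represented as

$$s_m(t) = \sum_{n=1}^{N} s_{mn}\, \psi_n(t) \quad (4.8)$$

with $m \in \{1, 2, \ldots, M\}$ where $s_{mn} = \langle s_m, \psi_n \rangle$ and $E_m = \sum_{n=1}^{N} s_{mn}^2$.
The signals can be represented by the vector

$$s_m = \left(s_{m1}, s_{m2}, \ldots, s_{mN}\right)^T$$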
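
The orthonormalization procedure above can be sketched numerically by sampling the waveforms and replacing integrals with Riemann sums. The function name `gram_schmidt`, the tolerance, and the test signals (an antipodal pair of rectangular pulses with amplitude A) are assumptions chosen for illustration only.

```python
import numpy as np

def gram_schmidt(signals, dt, tol=1e-9):
    """signals: array (M, K) of sampled waveforms; returns (basis, coeffs)."""
    basis = []
    for s in signals:
        residual = s.astype(float).copy()
        for psi in basis:
            # Remove the projection s_kj * psi_j, with s_kj = <s_k, psi_j>
            residual -= np.sum(s * psi) * dt * psi
        energy = np.sum(residual ** 2) * dt
        if energy > tol:                 # skip linearly dependent signals
            basis.append(residual / np.sqrt(energy))
    basis = np.array(basis)
    coeffs = signals @ basis.T * dt      # s_mn = <s_m, psi_n>
    return basis, coeffs

T, K, A = 1.0, 1000, 2.0
dt = T / K
s1 = A * np.ones(K)                      # rectangular pulse of amplitude A
signals = np.vstack([s1, -s1])           # antipodal pair
basis, coeffs = gram_schmidt(signals, dt)
print(basis.shape[0], coeffs.ravel())
```

For this antipodal pair the procedure returns a single basis function, so the signal set has dimension 1 and the coefficients are plus and minus A times the square root of T.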

Example 4.3 




Figure 4.6 



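$$\psi_1(t) = \frac{s_1(t)}{\sqrt{A^2 T}} \quad (4.9)$$

$$s_{11} = A\sqrt{T} \quad (4.10)$$

$$s_{21} = -\left(A\sqrt{T}\right) \quad (4.11)$$

$$\psi_2(t) = \left(s_2(t) - s_{21}\, \psi_1(t)\right) \frac{1}{\sqrt{\hat{E}_2}} = 0 \quad (4.12)$$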
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Figure 4.7 



Dimension of the signal set is 1 with $E_1 = s_{11}^2$ and $E_2 = s_{21}^2$.
Example 4.4 
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Figure 4.8 



$\psi_m(t) = \frac{s_m(t)}{\sqrt{E_s}}$ where $E_s = \int_0^T s_m^2(t)\, dt$
$$\forall m, n: d_{mn} = \left\| s_m - s_n \right\| = \sqrt{\sum_{j=1}^{N} \left(s_{mj} - s_{nj}\right)^2} \quad (4.13)$$

is the Euclidean distance between signals.

Example 4.5 

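Set of 4 equal energy biorthogonal signals, $s_1(t) = s(t)$, $s_2(t) = s^{\perp}(t)$, $s_3(t) = -s(t)$,
$s_4(t) = -s^{\perp}(t)$.

The orthonormal basis $\psi_1(t) = \frac{s(t)}{\sqrt{E_s}}$, $\psi_2(t) = \frac{s^{\perp}(t)}{\sqrt{E_s}}$ where
$E_s = \int_0^T s_m^2(t)\, dt$.

$$s_1 = \begin{pmatrix} \sqrt{E_s} \\ 0 \end{pmatrix},\quad
s_2 = \begin{pmatrix} 0 \\ \sqrt{E_s} \end{pmatrix},\quad
s_3 = \begin{pmatrix} -\sqrt{E_s} \\ 0 \end{pmatrix},\quad
s_4 = \begin{pmatrix} 0 \\ -\sqrt{E_s} \end{pmatrix}$$

The four signals can be geometrically represented using the vectors of projection coefficients
$s_1$, $s_2$, $s_3$, and $s_4$ as a set of constellation points.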
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Signal constellation 
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Figure 4.9 



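$$d_{21} = \left\| s_2 - s_1 \right\| = \sqrt{2 E_s} \quad (4.14)$$

$$d_{12} = d_{23} = d_{34} = d_{14} \quad (4.15)$$

$$d_{13} = \left\| s_1 - s_3 \right\| = 2\sqrt{E_s} \quad (4.16)$$

$$d_{13} = d_{24} \quad (4.17)$$

Minimum distance $d_{min} = \sqrt{2 E_s}$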
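
The pairwise distances of the biorthogonal constellation can be tabulated directly from the coordinate vectors. This is a small sketch with an assumed value E_s = 1; the dictionary keys use the 1-based signal indices of the text.

```python
import numpy as np
from itertools import combinations

Es = 1.0
r = np.sqrt(Es)
# Coordinate vectors s_1 .. s_4 of the four biorthogonal signals
points = np.array([[r, 0.0], [0.0, r], [-r, 0.0], [0.0, -r]])
dist = {(m + 1, n + 1): np.linalg.norm(points[m] - points[n])
        for m, n in combinations(range(4), 2)}
d_min = min(dist.values())
print(dist)
print(d_min)
```

Neighbouring points are separated by the minimum distance, while each antipodal pair is separated by twice the square root of E_s.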



4.4 Demodulation and Detection 4 

Consider the problem where signal set $\{s_1, s_2, \ldots, s_M\}$, for $t \in [0, T]$, is used to transmit
$\log_2 M$ bits. The modulated signal $X_t$ could be any of $\{s_1, s_2, \ldots, s_M\}$ during the interval
$0 \le t \le T$.



4 This content is available online at <http://cnx.org/content/m10054/2.14/>.
Figure 4.10: $r_t = X_t + N_t = s_m(t) + N_t$ for $0 \le t \le T$ for some $m \in \{1, 2, \ldots, M\}$.



Recall $s_m(t) = \sum_{n=1}^{N} s_{mn}\, \psi_n(t)$ for $m \in \{1, 2, \ldots, M\}$: the signals are decomposed
onto a set of orthonormal signals, perfectly.

Noise process can also be decomposed

$$N_t = \sum_{n=1}^{N} \eta_n\, \psi_n(t) + \tilde{N}_t \quad (4.18)$$

where $\eta_n = \int_0^T N_t\, \psi_n(t)\, dt$ is the projection onto the n-th basis signal and $\tilde{N}_t$ is
the left over noise.

The problem of demodulation and detection is to observe $r_t$ for $0 \le t \le T$ and decide which one of
the M signals was transmitted. Demodulation is covered here (Section 4.5). A discussion about detection
can be found here (Section 4.6).

4.5 Demodulation 5 

4.5.1 Demodulation 

Convert the continuous time received signal into a vector without loss of information (or performance) . 

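$$r_t = s_m(t) + N_t \quad (4.19)$$

$$r_t = \sum_{n=1}^{N} s_{mn}\, \psi_n(t) + \sum_{n=1}^{N} \eta_n\, \psi_n(t) + \tilde{N}_t \quad (4.20)$$

$$r_t = \sum_{n=1}^{N} \left(s_{mn} + \eta_n\right) \psi_n(t) + \tilde{N}_t \quad (4.21)$$

$$r_t = \sum_{n=1}^{N} r_n\, \psi_n(t) + \tilde{N}_t \quad (4.22)$$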



Proposition 4.1: 

The noise projection coefficients $\eta_n$'s are zero mean, Gaussian random variables and are mutually
independent if $N_t$ is a white Gaussian process.
Proof:

$$\mu_\eta(n) = E[\eta_n] = E\left[\int_0^T N_t\, \psi_n(t)\, dt\right] \quad (4.23)$$



5 This content is available online at <http://cnx.org/content/m10141/2.13/>.
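$$\mu_\eta(n) = \int_0^T E[N_t]\, \psi_n(t)\, dt = 0 \quad (4.24)$$

$$E[\eta_k \eta_n] = E\left[\int_0^T N_t\, \psi_k(t)\, dt \int_0^T N_{t'}\, \psi_n(t')\, dt'\right]
= \int_0^T \int_0^T E\left[N_t N_{t'}\right] \psi_k(t)\, \psi_n(t')\, dt\, dt' \quad (4.25)$$

$$E[\eta_k \eta_n] = \int_0^T \int_0^T R_N(t - t')\, \psi_k(t)\, \psi_n(t')\, dt\, dt' \quad (4.26)$$

$$E[\eta_k \eta_n] = \frac{N_0}{2} \int_0^T \int_0^T \delta(t - t')\, \psi_k(t)\, \psi_n(t')\, dt\, dt' \quad (4.27)$$

$$E[\eta_k \eta_n] = \frac{N_0}{2} \int_0^T \psi_k(t)\, \psi_n(t)\, dt = \frac{N_0}{2} \delta_{kn}
= \begin{cases} \frac{N_0}{2} & \text{if } k = n \\ 0 & \text{if } k \ne n \end{cases} \quad (4.28)$$

The $\eta_k$'s are uncorrelated and since they are Gaussian they are also independent. Therefore,
$\eta_k \sim \text{Gaussian}\left(0, \frac{N_0}{2}\right)$ and $R_\eta(k, n) = \frac{N_0}{2} \delta_{kn}$.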
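
Proposition 4.1 can be illustrated with a discrete-time simulation. The sketch below makes several assumptions that are not in the text: white noise is approximated by independent samples with variance N_0/(2 dt), the two basis functions (a constant and one cosine period) are arbitrary orthonormal choices, and the parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N0, T, K, trials = 2.0, 1.0, 200, 20000
dt = T / K
t = (np.arange(K) + 0.5) * dt
# Two orthonormal basis functions on [0, T] (assumed for illustration)
psi = np.vstack([np.ones(K) / np.sqrt(T),
                 np.sqrt(2 / T) * np.cos(2 * np.pi * t / T)])
# Discrete approximation of white noise with power spectral density N0/2
noise = rng.normal(0.0, np.sqrt(N0 / (2 * dt)), size=(trials, K))
eta = noise @ psi.T * dt          # eta[:, n] approximates the integral of N_t psi_n(t)
cov = np.cov(eta, rowvar=False)   # sample covariance of (eta_1, eta_2)
print(eta.mean(axis=0))
print(cov)
```

The sample means should be close to zero, the diagonal of the covariance close to N_0/2, and the off-diagonal entries close to zero, up to Monte Carlo error.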



Proposition 4.2: 

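The $r_n$'s, the projections of the received signal $r_t$ onto the orthonormal bases $\psi_n(t)$'s, are
independent from the residual noise process $\tilde{N}_t$.

The residual noise $\tilde{N}_t$ is irrelevant to the decision process on $r_t$.
Proof:

Recall $r_n = s_{mn} + \eta_n$, given $s_m(t)$ was transmitted. Therefore,

$$\mu_r(n) = E\left[s_{mn} + \eta_n\right] = s_{mn} \quad (4.29)$$

$$Var(r_n) = Var(\eta_n) = \frac{N_0}{2} \quad (4.30)$$

The correlation between $r_n$ and $\tilde{N}_t$

$$E\left[\tilde{N}_t r_n\right] = E\left[\left(N_t - \sum_{k=1}^{N} \eta_k \psi_k(t)\right) \left(s_{mn} + \eta_n\right)\right] \quad (4.31)$$

$$E\left[\tilde{N}_t r_n\right] = E\left[N_t - \sum_{k=1}^{N} \eta_k \psi_k(t)\right] s_{mn} + E\left[N_t \eta_n\right]
- \sum_{k=1}^{N} E\left[\eta_k \eta_n\right] \psi_k(t) \quad (4.32)$$

$$E\left[\tilde{N}_t r_n\right] = E\left[N_t \int_0^T N_{t'}\, \psi_n(t')\, dt'\right]
- \sum_{k=1}^{N} \frac{N_0}{2} \delta_{kn}\, \psi_k(t) \quad (4.33)$$

$$E\left[\tilde{N}_t r_n\right] = \int_0^T \frac{N_0}{2} \delta(t - t')\, \psi_n(t')\, dt' - \frac{N_0}{2} \psi_n(t) \quad (4.34)$$

$$E\left[\tilde{N}_t r_n\right] = \frac{N_0}{2} \psi_n(t) - \frac{N_0}{2} \psi_n(t) = 0 \quad (4.35)$$

Since both $\tilde{N}_t$ and $r_n$ are Gaussian, $\tilde{N}_t$ and $r_n$ are also independent.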



The conjecture is to ignore $\tilde{N}_t$ and extract information from $r = \left(r_1, r_2, \ldots, r_N\right)^T$.
Knowing the vector r we can reconstruct the relevant part of random process $r_t$ for $0 \le t \le T$

$$r_t = s_m(t) + N_t = \sum_{n=1}^{N} r_n\, \psi_n(t) + \tilde{N}_t \quad (4.36)$$







Figure 4.11
Figure 4.12 (correlators using $\psi_1(t), \ldots, \psi_N(t)$ followed by a detector that outputs $\hat{m}$)



Once the received signal has been converted to a vector, the correct transmitted signal must be detected 
based upon observations of the input vector. Detection is covered elsewhere (Section 4.6). 



4.6 Detection by Correlation 6 
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Figure 4.13 



















6 This content is available online at <http://cnx.org/content/m10091/2.15/>.
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4.6.1 Detection 

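Decide which $s_m(t)$ from the set of signals $\{s_1(t), \ldots, s_M(t)\}$ was transmitted based on observing
$r = \left(r_1, r_2, \ldots, r_N\right)^T$, the vector composed of the demodulated (Section 4.5) received signal,
that is, the vector of projections of the received signal onto the N bases.

$$\hat{m} = \arg\max_{1 \le m \le M} \Pr\left[s_m(t) \text{ was transmitted} \mid r \text{ was observed}\right] \quad (4.37)$$

Note that

$$\Pr\left[s_m \mid r\right] = \Pr\left[s_m(t) \text{ was transmitted} \mid r \text{ was observed}\right]
= \frac{f_{r \mid s_m}(r) \Pr\left[s_m\right]}{f_r(r)} \quad (4.38)$$

If $\Pr\left[s_m \text{ was transmitted}\right] = \frac{1}{M}$, that is, information symbols are equally likely to be
transmitted, then

$$\arg\max_{1 \le m \le M} \Pr\left[s_m \mid r\right] = \arg\max_{1 \le m \le M} f_{r \mid s_m}(r) \quad (4.39)$$

Since $r_t = s_m(t) + N_t$ for $0 \le t \le T$ and for some $m \in \{1, 2, \ldots, M\}$, then $r = s_m + \eta$ where
$\eta = \left(\eta_1, \ldots, \eta_N\right)^T$ and the $\eta_n$'s are Gaussian and independent.

$$\forall r_n \in \mathbb{R}: f_{r \mid s_m}(r) = \frac{1}{\left(\pi N_0\right)^{N/2}}
\exp\left(-\frac{1}{N_0} \sum_{n=1}^{N} \left(r_n - s_{m,n}\right)^2\right) \quad (4.40)$$

$$\hat{m} = \arg\max_{1 \le m \le M} f_{r \mid s_m}(r) = \arg\max_{1 \le m \le M} \ln f_{r \mid s_m}(r)
= \arg\max_{1 \le m \le M} \left(-\frac{N}{2} \ln\left(\pi N_0\right) - \frac{1}{N_0} \sum_{n=1}^{N} \left(r_n - s_{m,n}\right)^2\right)
= \arg\min_{1 \le m \le M} \sum_{n=1}^{N} \left(r_n - s_{m,n}\right)^2 \quad (4.41)$$

Let $D(r, s_m)$ denote the squared $l_2$ distance between vectors r and $s_m$, defined as
$D(r, s_m) = \sum_{n=1}^{N} \left(r_n - s_{m,n}\right)^2$. Then

$$\hat{m} = \arg\min_{1 \le m \le M} D(r, s_m)
= \arg\min_{1 \le m \le M} \left(\|r\|^2 - 2 \langle r, s_m \rangle + \|s_m\|^2\right) \quad (4.42)$$

where $\|r\|$ is the $l_2$ norm of vector r defined as $\|r\| = \sqrt{\sum_{n=1}^{N} r_n^2}$.

$$\hat{m} = \arg\max_{1 \le m \le M} \left(2 \langle r, s_m \rangle - \|s_m\|^2\right) \quad (4.43)$$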
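
The equivalence between the minimum-distance rule (4.42) and the correlation rule (4.43) can be checked numerically. The following sketch uses hypothetical function names and an assumed four-point constellation with Gaussian noise; indices are 0-based in the code, whereas the text counts signals from 1.

```python
import numpy as np

def detect_min_distance(r, S):
    """Index m minimizing the squared distance D(r, s_m)."""
    return int(np.argmin(np.sum((S - r) ** 2, axis=1)))

def detect_correlation(r, S):
    """Index m maximizing 2<r, s_m> - ||s_m||^2."""
    return int(np.argmax(2 * S @ r - np.sum(S ** 2, axis=1)))

# Assumed constellation: four biorthogonal points with unit energy
S = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
rng = np.random.default_rng(1)
sent = rng.integers(0, 4, size=1000)
r_samples = S[sent] + rng.normal(0.0, 0.5, size=(1000, 2))
agree = all(detect_min_distance(r, S) == detect_correlation(r, S)
            for r in r_samples)
print(agree)
```

Because the term with the norm of r is common to all candidates, dropping it does not change the decision, which is why the two rules agree.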



This type of receiver system is known as a correlation (or correlator-type) receiver. Examples of the use 
of such a system are found here (Section 4.7). Another type of receiver involves linear, time-invariant filters 
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and is known as a matched filter (Section 4.8) receiver. An analysis of the performance of a correlator-type 
receiver using antipodal and orthogonal binary signals can be found in Performance Analysis 7 . 

4.7 Examples of Correlation Detection 8 

The implementation and theory of correlator-type receivers can be found in Detection (Section 4.6). 
Example 4.6 



^(t) 



— > 



Figure 4.14 



$\hat{m} = 2$ since $D(r, s_1) > D(r, s_2)$ or $\|s_1\|^2 = \|s_2\|^2$ and $\langle r, s_2 \rangle > \langle r, s_1 \rangle$.



Figure 4.15 (correlation receiver: each branch forms $\sum_{n=1}^{N} r_n s_{mn} - \frac{1}{2} \|s_m\|^2$ and the
largest branch output selects $\hat{m}$)



7 "Performance Analysis" <http://cnx.org/content/m10106/latest/>

8 This content is available online at <http://cnx.org/content/m10149/2.10/>.
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Example 4.7 

Data symbols "0" or "1" with equal probability. Modulator $s_1(t) = s(t)$ for $0 \le t \le T$ and
$s_2(t) = -s(t)$ for $0 \le t \le T$.




Figure 4.16 



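$\psi_1(t) = \frac{s(t)}{\sqrt{A^2 T}}$, $s_{11} = A\sqrt{T}$, and $s_{21} = -\left(A\sqrt{T}\right)$

$$\forall m \in \{1, 2\}: r_t = s_m(t) + N_t \quad (4.44)$$

Figure 4.17

$$r_1 = A\sqrt{T} + \eta_1 \quad (4.45)$$

or

$$r_1 = -\left(A\sqrt{T}\right) + \eta_1 \quad (4.46)$$

$\eta_1$ is Gaussian with zero mean and variance $\frac{N_0}{2}$.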
Figure 4.18

$\hat{m} = \arg\max \left\{A\sqrt{T}\, r_1, -\left(A\sqrt{T}\, r_1\right)\right\}$; since $A\sqrt{T} > 0$ and
$\Pr[s_1] = \Pr[s_2]$, the MAP decision rule decides:

$s_1(t)$ was transmitted if $r_1 \ge 0$
$s_2(t)$ was transmitted if $r_1 < 0$
An alternate demodulator:

$$\left(r_t = s_m(t) + N_t\right) \Rightarrow \left(r = s_m + \eta\right) \quad (4.47)$$



4.8 Matched Filters 9 

Signal to Noise Ratio (SNR) at the output of the demodulator is a measure of the quality of the
demodulator.

$$SNR = \frac{\text{signal energy}}{\text{noise energy}} \quad (4.48)$$

In the correlator described earlier, $E_s = \|s_m\|^2$ and $\sigma_{\eta_n}^2 = \frac{N_0}{2}$. Is it possible to
design a demodulator based on linear time-invariant filters with maximum signal-to-noise ratio?



9 This content is available online at <http://cnx.org/content/m10101/2.14/>.
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Figure 4.19 



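If $s_m(t)$ is the transmitted signal, then the output of the k-th filter is given as

$$y_k(t) = \int_{-\infty}^{\infty} r_\tau\, h_k(t - \tau)\, d\tau
= \int_{-\infty}^{\infty} \left(s_m(\tau) + N_\tau\right) h_k(t - \tau)\, d\tau \quad (4.49)$$

Sampling the output at time T yields

$$y_k(T) = \int_{-\infty}^{\infty} s_m(\tau)\, h_k(T - \tau)\, d\tau + \int_{-\infty}^{\infty} N_\tau\, h_k(T - \tau)\, d\tau \quad (4.50)$$

The noise contribution:

$$\nu_k = \int_{-\infty}^{\infty} N_\tau\, h_k(T - \tau)\, d\tau \quad (4.51)$$

The expected value of the noise component is

$$E[\nu_k] = E\left[\int_{-\infty}^{\infty} N_\tau\, h_k(T - \tau)\, d\tau\right] = 0 \quad (4.52)$$

The variance of the noise component is the second moment since the mean is zero and is given as

$$\sigma^2(\nu_k) = E\left[\nu_k \nu_k^*\right]
= E\left[\int_{-\infty}^{\infty} N_\tau\, h_k(T - \tau)\, d\tau \int_{-\infty}^{\infty} N_{\tau'}^*\, h_k^*(T - \tau')\, d\tau'\right] \quad (4.53)$$

$$E\left[\nu_k \nu_k^*\right] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{N_0}{2} \delta(\tau - \tau')\, h_k(T - \tau)\, h_k^*(T - \tau')\, d\tau\, d\tau'
= \frac{N_0}{2} \int_{-\infty}^{\infty} \left|h_k(T - \tau)\right|^2 d\tau \quad (4.54)$$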
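Signal Energy can be written as

$$\left(\int_{-\infty}^{\infty} s_m(\tau)\, h_k(T - \tau)\, d\tau\right)^2 \quad (4.55)$$

and the signal-to-noise ratio (SNR) as

$$SNR = \frac{\left(\int_{-\infty}^{\infty} s_m(\tau)\, h_k(T - \tau)\, d\tau\right)^2}
{\frac{N_0}{2} \int_{-\infty}^{\infty} \left|h_k(T - \tau)\right|^2 d\tau} \quad (4.56)$$

The signal-to-noise ratio can be maximized considering the well-known Cauchy-Schwarz Inequality

$$\left|\int_{-\infty}^{\infty} g_1(x)\, g_2^*(x)\, dx\right|^2 \le
\int_{-\infty}^{\infty} \left|g_1(x)\right|^2 dx \int_{-\infty}^{\infty} \left|g_2(x)\right|^2 dx \quad (4.57)$$

with equality when $g_1(x) = \alpha g_2(x)$. Applying the inequality directly yields an upper bound on SNR

$$\frac{\left(\int_{-\infty}^{\infty} s_m(\tau)\, h_k(T - \tau)\, d\tau\right)^2}
{\frac{N_0}{2} \int_{-\infty}^{\infty} \left|h_k(T - \tau)\right|^2 d\tau}
\le \frac{2}{N_0} \int_{-\infty}^{\infty} \left|s_m(\tau)\right|^2 d\tau \quad (4.58)$$

with equality when $\forall \tau: h_k^{opt}(T - \tau) = \alpha\, s_m^*(\tau)$. Therefore, the filter to examine
signal m should be

Matched Filter

$$\forall \tau: h_m^{opt}(\tau) = s_m^*(T - \tau) \quad (4.59)$$

The constant factor is not relevant when one considers the signal to noise ratio. The maximum SNR is
unchanged when both the numerator and denominator are scaled.

$$\frac{2}{N_0} \int_{-\infty}^{\infty} \left|s_m(\tau)\right|^2 d\tau = \frac{2 E_s}{N_0} \quad (4.60)$$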
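
The bound on the output SNR can be illustrated numerically with a discretized version of the SNR expression. In this sketch the function name `output_snr`, the ramp signal s(t) = t, the rectangular comparison filter, and the parameter values are all assumptions made for illustration.

```python
import numpy as np

def output_snr(s, g, dt, N0):
    """SNR at the sampling instant, with g(tau) = h(T - tau)."""
    signal = (np.sum(s * g) * dt) ** 2
    noise = (N0 / 2) * np.sum(g ** 2) * dt
    return signal / noise

T, K, N0 = 1.0, 1000, 0.5
dt = T / K
t = (np.arange(K) + 0.5) * dt
s = t.copy()                                   # assumed signal s(t) = t on [0, T]
Es = np.sum(s ** 2) * dt                       # signal energy
snr_matched = output_snr(s, s, dt, N0)         # h(t) = s(T - t)
snr_rect = output_snr(s, np.ones(K), dt, N0)   # mismatched rectangular filter
print(snr_matched, 2 * Es / N0, snr_rect)
```

The matched choice attains 2 E_s / N_0, while the mismatched rectangular filter gives a strictly smaller value, as the Cauchy-Schwarz argument predicts.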

Examples involving matched filter receivers can be found here (Section 4.9). An analysis in the frequency 
domain is contained in Matched Filters in the Frequency Domain 10 . 

Another type of receiver system is the correlation (Section 4.6) receiver. A performance analysis of both 
matched filters and correlator-type receivers can be found in Performance Analysis 11 . 

4.9 Examples with Matched Filters 12 

The theory and rationale behind matched filter receivers can be found in Matched Filters (Section 4.8). 
Example 4.8 



10 "Matched Filters in the Frequency Domain" <http://cnx.org/content/m10151/latest/>

11 "Performance Analysis" <http://cnx.org/content/m10106/latest/>

12 This content is available online at <http://cnx.org/content/m10150/2.10/>.
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s 2 (t) 
t 



h*(t) 





Figure 4.20 



$s_1(t) = t$ for $0 \le t \le T$
$s_2(t) = -t$ for $0 \le t \le T$
$h_1(t) = T - t$ for $0 \le t \le T$
$h_2(t) = -T + t$ for $0 \le t \le T$



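$\forall t, 0 \le t \le T$:

$$\tilde{s}_1(t) = \int_{-\infty}^{\infty} s_1(\tau)\, h_1(t - \tau)\, d\tau \quad (4.61)$$

$$\tilde{s}_1(t) = \int_0^t \tau \left(T - t + \tau\right) d\tau \quad (4.62)$$

$$\tilde{s}_1(t) = \frac{1}{2} (T - t)\, \tau^2 \Big|_0^t + \frac{1}{3} \tau^3 \Big|_0^t
= \frac{1}{2} (T - t)\, t^2 + \frac{1}{3} t^3 \quad (4.63)$$

$$\tilde{s}_1(T) = \frac{T^3}{3} \quad (4.64)$$

Compared to the correlator-type demodulation

$$\psi_1(t) = \frac{s_1(t)}{\sqrt{E_s}}$$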
$$s_{11} = \int_0^T s_1(\tau)\, \psi_1(\tau)\, d\tau \quad (4.65)$$

$$\int_0^t s_1(\tau)\, \psi_1(\tau)\, d\tau = \frac{1}{\sqrt{E_s}} \int_0^t \tau\, \tau\, d\tau
= \frac{1}{\sqrt{E_s}} \frac{1}{3} t^3 \quad (4.66)$$

Figure 4.22 (correlator output versus time)



Example 4.9 

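Assume binary data is transmitted at the rate of $\frac{1}{T}$ Hertz.

$0 \Rightarrow (b = 1) \Rightarrow \left(s_1(t) = s(t)\right)$ for $0 \le t \le T$

$1 \Rightarrow (b = -1) \Rightarrow \left(s_2(t) = -s(t)\right)$ for $0 \le t \le T$

$$X_t = \sum_i b_i\, s(t - iT) \quad (4.67)$$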
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s,(t) 




Figure 4.23 



4.10 Performance Analysis of Binary Orthogonal Signals with 
Correlation 13 

Orthogonal signals with equally likely bits, $r_t = s_m(t) + N_t$ for $0 \le t \le T$, $m = 1$ or $m = 2$, and
$\langle s_1, s_2 \rangle = 0$.

4.10.1 Correlation (correlator-type) receiver

$r = \left(r_1, r_2\right)^T = s_m + \eta$ (see Figure 4.24)



13 This content is available online at <http://cnx.org/content/m10154/2.11/>.
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Jv* ^(t) 



Figure 4.24 



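Decide $s_1(t)$ was transmitted if $r_1 \ge r_2$.

$$P_e = \Pr\left[\hat{m} \ne m\right] = \Pr\left[\hat{b} \ne b\right] \quad (4.68)$$

$$P_e = \frac{1}{2} \Pr\left[r \in R_2 \mid s_1(t) \text{ transmitted}\right] + \frac{1}{2} \Pr\left[r \in R_1 \mid s_2(t) \text{ transmitted}\right]$$

$$= \frac{1}{2} \iint_{R_2} f_{r \mid s_1(t)}(r)\, dr_1\, dr_2 + \frac{1}{2} \iint_{R_1} f_{r \mid s_2(t)}(r)\, dr_1\, dr_2$$

$$= \frac{1}{2} \iint_{R_2} \frac{1}{\sqrt{\pi N_0}} e^{-\frac{\left(r_1 - \sqrt{E_s}\right)^2}{N_0}} \frac{1}{\sqrt{\pi N_0}} e^{-\frac{r_2^2}{N_0}}\, dr_1\, dr_2
+ \frac{1}{2} \iint_{R_1} \frac{1}{\sqrt{\pi N_0}} e^{-\frac{r_1^2}{N_0}} \frac{1}{\sqrt{\pi N_0}} e^{-\frac{\left(r_2 - \sqrt{E_s}\right)^2}{N_0}}\, dr_1\, dr_2 \quad (4.69)$$

Alternatively, if $s_1(t)$ is transmitted we decide on the wrong signal if $r_2 > r_1$ or
$\eta_2 > \eta_1 + \sqrt{E_s}$ or when $\eta_2 - \eta_1 > \sqrt{E_s}$.

$$P_e = \frac{1}{2} \int_{\sqrt{E_s}}^{\infty} \frac{1}{\sqrt{2 \pi N_0}} e^{-\frac{\eta'^2}{2 N_0}}\, d\eta'
+ \frac{1}{2} \Pr\left[r_1 \ge r_2 \mid s_2(t) \text{ transmitted}\right] = Q\left(\sqrt{\frac{E_s}{N_0}}\right) \quad (4.70)$$

Note that the distance between $s_1$ and $s_2$ is $d_{12} = \sqrt{2 E_s}$. The average bit error probability is
$P_e = Q\left(\frac{d_{12}}{\sqrt{2 N_0}}\right)$, as we had for the antipodal case 14 . Note also that the
bit-error probability is the same as for the matched filter (Section 4.11) receiver.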
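
The error probability for binary orthogonal signals can be cross-checked with a short Monte Carlo sketch. The parameter values (E_s = 4, N_0 = 1), the number of trials, and the helper `Q` built from the complementary error function are assumptions for illustration; only the case where the first signal is transmitted is simulated, since the two cases are symmetric.

```python
import math
import numpy as np

def Q(x):
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2))

rng = np.random.default_rng(2)
Es, N0, trials = 4.0, 1.0, 200000
# Noise projections: independent Gaussian, zero mean, variance N0/2
eta = rng.normal(0.0, math.sqrt(N0 / 2), size=(trials, 2))
r = np.array([math.sqrt(Es), 0.0]) + eta   # s_1 transmitted in every trial
pe_sim = np.mean(r[:, 1] > r[:, 0])        # error when r_2 exceeds r_1
pe_theory = Q(math.sqrt(Es / N0))
print(pe_sim, pe_theory)
```

The simulated error rate should agree with the closed-form value up to Monte Carlo fluctuation.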



14 "Performance Analysis of Antipodal Binary signals with Correlation" <http://cnx.org/content/m10152/latest/>






4.11 Performance Analysis of Orthogonal Binary Signals with
Matched Filters 15



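The received signal $r_t$ is passed through the two matched filters, giving the sampled outputs
$Y = \left(Y_1(T), Y_2(T)\right)^T$.

If $s_1(t)$ is transmitted

$$Y_1(T) = \int_{-\infty}^{\infty} s_1(\tau)\, h_1^{opt}(T - \tau)\, d\tau + \nu_1(T)
= \int_{-\infty}^{\infty} s_1(\tau)\, s_1^*(\tau)\, d\tau + \nu_1(T) \quad (4.71)$$

$$Y_1(T) = E_s + \nu_1(T) \quad (4.72)$$

$$Y_2(T) = \int_{-\infty}^{\infty} s_1(\tau)\, s_2^*(\tau)\, d\tau + \nu_2(T) = \nu_2(T) \quad (4.73)$$

If $s_2(t)$ is transmitted, $Y_1(T) = \nu_1(T)$ and $Y_2(T) = E_s + \nu_2(T)$.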



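Figure 4.25 (decision boundary in the plane of $Y_1(T)$ and $Y_2(T)$)

$$H0: Y = \begin{pmatrix} E_s \\ 0 \end{pmatrix} + \begin{pmatrix} \nu_1 \\ \nu_2 \end{pmatrix} \quad (4.74)$$

$$H1: Y = \begin{pmatrix} 0 \\ E_s \end{pmatrix} + \begin{pmatrix} \nu_1 \\ \nu_2 \end{pmatrix} \quad (4.75)$$

15 This content is available online at <http://cnx.org/content/m10155/2.9/>.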
where $\nu_1$ and $\nu_2$ are independent and Gaussian with zero mean and variance $\frac{N_0}{2} E_s$. The
analysis is identical to the correlator example (Section 4.10).

$$P_e = Q\left(\sqrt{\frac{E_s}{N_0}}\right) \quad (4.76)$$



Note that the maximum likelihood detector decides based on comparing $Y_1$ and $Y_2$. If $Y_1 \ge Y_2$ then
$s_1$ was sent; otherwise $s_2$ was transmitted. For a similar analysis for binary antipodal signals, refer
here 16 . See Figure 4.26 or Figure 4.27.
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"H s 2 (T-t) 
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Figure 4.26 



S,(T-t)-s 2 (T-t) 



Y -Y 2 



< £ o 



Figure 4.27 



4.12 Carrier Phase Modulation 17



4.12.1 Phase Shift Keying (PSK) 

Information is impressed on the phase of the carrier. As data changes from symbol period to symbol period, 
the phase shifts. 



$$\forall m \in \{1, 2, \ldots, M\}: s_m(t) = A P_T(t) \cos\left(2\pi f_c t + \frac{2\pi (m - 1)}{M}\right) \quad (4.77)$$

Example 4.10

Binary $s_1(t)$ or $s_2(t)$



16 "Performance Analysis of Binary Antipodal Signals with Matched Filters" <http://cnx.org/content/m10153/latest/>
17 This content is available online at <http://cnx.org/content/m10128/2.10/>.
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4.12.2 Representing the Signals 

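An orthonormal basis to represent the signals is

$$\psi_1(t) = \frac{1}{\sqrt{E_s}} A P_T(t) \cos\left(2\pi f_c t\right) \quad (4.78)$$

$$\psi_2(t) = \frac{-1}{\sqrt{E_s}} A P_T(t) \sin\left(2\pi f_c t\right) \quad (4.79)$$

The signal

$$s_m(t) = A P_T(t) \cos\left(2\pi f_c t + \frac{2\pi (m - 1)}{M}\right) \quad (4.80)$$

$$s_m(t) = A \cos\left(\frac{2\pi (m - 1)}{M}\right) P_T(t) \cos\left(2\pi f_c t\right)
- A \sin\left(\frac{2\pi (m - 1)}{M}\right) P_T(t) \sin\left(2\pi f_c t\right) \quad (4.81)$$

The signal energy

$$E_s = \int_{-\infty}^{\infty} A^2 P_T^2(t) \cos^2\left(2\pi f_c t + \frac{2\pi (m - 1)}{M}\right) dt
= \int_0^T A^2 \left(\frac{1}{2} + \frac{1}{2} \cos\left(4\pi f_c t + \frac{4\pi (m - 1)}{M}\right)\right) dt \quad (4.82)$$

$$E_s = \frac{A^2 T}{2} + \frac{1}{2} A^2 \int_0^T \cos\left(4\pi f_c t + \frac{4\pi (m - 1)}{M}\right) dt
\approx \frac{A^2 T}{2} \quad (4.83)$$

(Note that in the above equation, the integral in the last step before the approximation is very small.)
Therefore,

$$\psi_1(t) = \sqrt{\frac{2}{T}} P_T(t) \cos\left(2\pi f_c t\right) \quad (4.84)$$

$$\psi_2(t) = -\sqrt{\frac{2}{T}} P_T(t) \sin\left(2\pi f_c t\right) \quad (4.85)$$

In general,

$$\forall m \in \{1, 2, \ldots, M\}: s_m(t) = A P_T(t) \cos\left(2\pi f_c t + \frac{2\pi (m - 1)}{M}\right) \quad (4.86)$$

and

$$\psi_1(t) = \sqrt{\frac{2}{T}} P_T(t) \cos\left(2\pi f_c t\right) \quad (4.87)$$

$$\psi_2(t) = -\sqrt{\frac{2}{T}} P_T(t) \sin\left(2\pi f_c t\right) \quad (4.88)$$

$$s_m = \begin{pmatrix} \sqrt{E_s} \cos\left(\frac{2\pi (m - 1)}{M}\right) \\ \sqrt{E_s} \sin\left(\frac{2\pi (m - 1)}{M}\right) \end{pmatrix} \quad (4.89)$$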
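
The coordinates of the PSK signals on the two-dimensional basis can be verified by numerical integration. This sketch assumes M = 4, a carrier frequency with an integer number of cycles per symbol (f_c T = 10), and the sign convention in which the second basis function carries a minus sign on the sine; all of these are illustrative assumptions.

```python
import numpy as np

M, A, T, fc, K = 4, 1.0, 1.0, 10.0, 20000
dt = T / K
t = (np.arange(K) + 0.5) * dt
# Assumed basis: psi_2 carries a minus sign so that s_m2 = +sqrt(Es) sin(phase)
psi1 = np.sqrt(2 / T) * np.cos(2 * np.pi * fc * t)
psi2 = -np.sqrt(2 / T) * np.sin(2 * np.pi * fc * t)
Es = A ** 2 * T / 2
phases = 2 * np.pi * np.arange(M) / M          # 2*pi*(m-1)/M for m = 1..M
numeric = np.array([[np.sum(A * np.cos(2 * np.pi * fc * t + ph) * psi) * dt
                     for psi in (psi1, psi2)] for ph in phases])
expected = np.sqrt(Es) * np.column_stack([np.cos(phases), np.sin(phases)])
print(np.round(numeric, 6))
```

The projections land on a circle of radius equal to the square root of E_s, spaced by 2 pi / M in angle.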
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4.12.3 Demodulation and Detection 

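$$r_t = s_m(t) + N_t, \text{ for some } m \in \{1, 2, \ldots, M\} \quad (4.90)$$

We must note that due to phase offset of the oscillator at the transmitter, phase jitter or phase changes
occur because of propagation delay.

$$r_t = A P_T(t) \cos\left(2\pi f_c t + \frac{2\pi (m - 1)}{M} + \phi\right) + N_t \quad (4.91)$$

For binary PSK, the modulation is antipodal, and the optimum receiver in AWGN has average bit-error
probability

$$P_e = Q\left(\sqrt{\frac{2 E_s}{N_0}}\right) = Q\left(A \sqrt{\frac{T}{N_0}}\right) \quad (4.92)$$

The receiver where

$$r_t = \pm \left(A P_T(t) \cos\left(2\pi f_c t + \phi\right)\right) + N_t \quad (4.93)$$

The statistics

$$r_1 = \int_0^T r_t\, \alpha \cos\left(2\pi f_c t + \hat{\phi}\right) dt
= \pm \int_0^T \alpha A \cos\left(2\pi f_c t + \phi\right) \cos\left(2\pi f_c t + \hat{\phi}\right) dt
+ \int_0^T \alpha \cos\left(2\pi f_c t + \hat{\phi}\right) N_t\, dt \quad (4.94)$$

$$r_1 = \pm \frac{\alpha A}{2} \int_0^T \left(\cos\left(4\pi f_c t + \phi + \hat{\phi}\right) + \cos\left(\phi - \hat{\phi}\right)\right) dt + \eta_1 \quad (4.95)$$

$$r_1 = \pm \frac{\alpha A}{2} T \cos\left(\phi - \hat{\phi}\right)
\pm \int_0^T \frac{\alpha A}{2} \cos\left(4\pi f_c t + \phi + \hat{\phi}\right) dt + \eta_1
\approx \pm \frac{\alpha A T}{2} \cos\left(\phi - \hat{\phi}\right) + \eta_1 \quad (4.96)$$

where $\eta_1 = \alpha \int_0^T N_t \cos\left(\omega_c t + \hat{\phi}\right) dt$ is zero mean Gaussian with variance
approximately $\frac{\alpha^2 N_0 T}{4}$.
Therefore,

$$P_e = Q\left(\frac{2 \frac{\alpha A T}{2} \cos\left(\phi - \hat{\phi}\right)}{2 \sqrt{\frac{\alpha^2 N_0 T}{4}}}\right)
= Q\left(\cos\left(\phi - \hat{\phi}\right) A \sqrt{\frac{T}{N_0}}\right) \quad (4.97)$$

which is not a function of $\alpha$ and depends strongly on phase accuracy.

$$P_e = Q\left(\cos\left(\phi - \hat{\phi}\right) \sqrt{\frac{2 E_s}{N_0}}\right) \quad (4.98)$$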

The above result implies that the amplitude of the local oscillator in the correlator structure does not play 
a role in the performance of the correlation receiver. However, the accuracy of the phase does indeed play a 
major role. This point can be seen in the following example: 
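
The sensitivity to phase error can be made concrete by evaluating the bit-error expression for a few phase offsets. The function names and the operating point E_s/N_0 = 4 are assumptions chosen for illustration.

```python
import math

def Q(x):
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def bpsk_pe(es_over_n0, phase_error):
    """Bit-error probability of BPSK with a phase error in radians."""
    return Q(math.cos(phase_error) * math.sqrt(2 * es_over_n0))

for deg in (0, 15, 30, 45, 60):
    print(deg, bpsk_pe(4.0, math.radians(deg)))
```

The error probability grows as the phase error increases and reaches one half when the local oscillator is 90 degrees out of phase, while the oscillator amplitude does not appear at all.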
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Example 4.11 

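$$x_{t'} = (-1)^i A \cos\left(-\left(2\pi f_c t'\right) + 2\pi f_c \tau\right) \quad (4.99)$$

$$x_t = (-1)^i A \cos\left(2\pi f_c t - \left(2\pi f_c \tau' - 2\pi f_c \tau + \theta'\right)\right) \quad (4.100)$$

Local oscillator should match to phase $\theta$.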

4.13 Carrier Frequency Modulation 18 

4.13.1 Frequency Shift Keying (FSK) 

The data is impressed upon the carrier frequency. Therefore, the M different signals are 

$$s_m(t) = A P_T(t) \cos\left(2\pi f_c t + 2\pi (m - 1) \Delta(f)\, t + \theta_m\right) \quad (4.101)$$

for $m \in \{1, 2, \ldots, M\}$

The M different signals have M different carrier frequencies with possibly different phase angles since
the generators of these carrier signals may be different. The carriers are

$$f_1 = f_c \quad (4.102)$$

$$f_2 = f_c + \Delta(f)$$

$$f_M = f_c + (M - 1) \Delta(f)$$

Thus, the M signals may be designed to be orthogonal to each other.

$$\langle s_m, s_n \rangle = \int_0^T A^2 \cos\left(2\pi f_c t + 2\pi (m - 1) \Delta(f)\, t + \theta_m\right)
\cos\left(2\pi f_c t + 2\pi (n - 1) \Delta(f)\, t + \theta_n\right) dt \quad (4.103)$$

$$= \frac{A^2}{2} \int_0^T \cos\left(4\pi f_c t + 2\pi (n + m - 2) \Delta(f)\, t + \theta_m + \theta_n\right) dt
+ \frac{A^2}{2} \int_0^T \cos\left(2\pi (m - n) \Delta(f)\, t + \theta_m - \theta_n\right) dt$$

$$= \frac{A^2}{2} \frac{\sin\left(4\pi f_c T + 2\pi (n + m - 2) \Delta(f)\, T + \theta_m + \theta_n\right) - \sin\left(\theta_m + \theta_n\right)}{4\pi f_c + 2\pi (n + m - 2) \Delta(f)}
+ \frac{A^2}{2} \left(\frac{\sin\left(2\pi (m - n) \Delta(f)\, T + \theta_m - \theta_n\right)}{2\pi (m - n) \Delta(f)} - \frac{\sin\left(\theta_m - \theta_n\right)}{2\pi (m - n) \Delta(f)}\right)$$

If $2 f_c T + (n + m - 2) \Delta(f)\, T$ is an integer, and if $(m - n) \Delta(f)\, T$ is also an integer, then
$\langle s_m, s_n \rangle = 0$. If $\Delta(f)\, T$ is an integer, then $\langle s_m, s_n \rangle \approx 0$ when
$f_c$ is much larger than $\frac{1}{T}$.
In case $\forall m: \theta_m = 0$

$$\langle s_m, s_n \rangle \approx \frac{A^2 T}{2} \text{sinc}\left(2 (m - n) \Delta(f)\, T\right) \quad (4.104)$$

Therefore, the frequency spacing could be as small as $\Delta(f) = \frac{1}{2T}$ since $\text{sinc}(x) = 0$ if
$x = \pm 1$ or $\pm 2$.
If the signals are designed to be orthogonal then the average probability of error for binary FSK with
optimum receiver is

$$P_e = Q\left(\sqrt{\frac{E_s}{N_0}}\right) \quad (4.105)$$



18 This content is available online at <http://cnx.org/content/m10163/2.10/>.
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in AWGN. 

Note that sinc (x) takes its minimum value not at $x = \pm 1$ but at approximately $\pm 1.4$, and the minimum
value is approximately -0.216. Therefore if $\Delta(f) = \frac{0.7}{T}$ then

$$P_e = Q\left(\sqrt{\frac{1.216 E_s}{N_0}}\right) \quad (4.106)$$

which is a gain of $10 \log_{10} 1.216 \approx 0.85$ dB over orthogonal FSK.
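
The dependence of the correlation between two FSK tones on the frequency spacing can be checked by numerical integration. The sketch below assumes zero phase angles, a carrier with f_c T = 50, and a hypothetical helper `fsk_correlation`; the result is normalized by E_s = A^2 T / 2.

```python
import numpy as np

def fsk_correlation(delta_f, T=1.0, fc=50.0, A=1.0, K=200000):
    """Numerical inner product of two FSK tones spaced delta_f apart."""
    dt = T / K
    t = (np.arange(K) + 0.5) * dt
    s1 = A * np.cos(2 * np.pi * fc * t)
    s2 = A * np.cos(2 * np.pi * (fc + delta_f) * t)
    return np.sum(s1 * s2) * dt

T, A = 1.0, 1.0
Es = A ** 2 * T / 2
rho_half = fsk_correlation(0.5 / T) / Es   # spacing 1/(2T): orthogonal tones
rho_07 = fsk_correlation(0.7 / T) / Es     # spacing 0.7/T: near the sinc minimum
print(rho_half, rho_07, np.sinc(1.4))
```

The normalized correlation is essentially zero at a spacing of 1/(2T) and close to the negative sinc value at a spacing of 0.7/T; the small residual difference comes from the double-frequency term that the approximation drops.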

4.14 Differential Phase Shift Keying 19 

The phase lock loop provides estimates of the phase of the incoming modulated signal. A phase ambiguity
of exactly $\pi$ is a common occurrence in many phase lock loop (PLL) implementations.

Therefore it is possible that $\hat{\theta} = \theta + \pi$ without the knowledge of the receiver. Even if there
is no noise, if $b = 1$ then $\hat{b} = 0$ and if $b = 0$ then $\hat{b} = 1$.

In the presence of noise, an incorrect decision due to noise may result in a correct final decision (in
the binary case, when there is a $\pi$ phase ambiguity), with probability:

$$P_e = 1 - Q\left(\sqrt{\frac{2 E_s}{N_0}}\right) \quad (4.107)$$

Consider a stream of bits $a_n \in \{0, 1\}$ and BPSK modulated signal

$$\sum_n (-1)^{a_n} A P_T(t - nT) \cos\left(2\pi f_c t + \theta\right) \quad (4.108)$$

In differential PSK, the transmitted bits are first encoded $b_n = a_n \oplus b_{n-1}$ with initial symbol
(e.g. $b_0$) chosen without loss of generality to be either 0 or 1.
Transmitted DPSK signals

$$\sum_n (-1)^{b_n} A P_T(t - nT) \cos\left(2\pi f_c t + \theta\right) \quad (4.109)$$

The decoder can be constructed as

$$b_{n-1} \oplus b_n = b_{n-1} \oplus a_n \oplus b_{n-1} = 0 \oplus a_n = a_n \quad (4.110)$$

If two consecutive bits are detected correctly, that is, if $\hat{b}_n = b_n$ and $\hat{b}_{n-1} = b_{n-1}$, then

$$\hat{a}_n = \hat{b}_n \oplus \hat{b}_{n-1} = b_n \oplus b_{n-1} = a_n \oplus b_{n-1} \oplus b_{n-1} = a_n \quad (4.111)$$



19 This content is available online at <http://cnx.org/content/m10156/2.7/>.
If $\hat{b}_n = b_n \oplus 1$ and $\hat{b}_{n-1} = b_{n-1} \oplus 1$, that is, two consecutive bits are detected
incorrectly, then

$$\hat{a}_n = \hat{b}_n \oplus \hat{b}_{n-1} = b_n \oplus 1 \oplus b_{n-1} \oplus 1 = b_n \oplus b_{n-1} \oplus 1 \oplus 1
= b_n \oplus b_{n-1} \oplus 0 = b_n \oplus b_{n-1} = a_n \quad (4.112)$$



If $\hat{b}_n = b_n \oplus 1$ and $\hat{b}_{n-1} = b_{n-1}$, that is, one of two consecutive bits is detected in
error, then there will be an error, and the probability of that error for DPSK is

$$P_e = \Pr\left[\hat{a}_n \ne a_n\right]
= \Pr\left[\hat{b}_n = b_n, \hat{b}_{n-1} \ne b_{n-1}\right] + \Pr\left[\hat{b}_n \ne b_n, \hat{b}_{n-1} = b_{n-1}\right]
= 2 Q\left(\sqrt{\frac{2 E_s}{N_0}}\right) \left[1 - Q\left(\sqrt{\frac{2 E_s}{N_0}}\right)\right]
\approx 2 Q\left(\sqrt{\frac{2 E_s}{N_0}}\right) \quad (4.113)$$

This approximation holds if Q is small.
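
The differential encoding and decoding rules can be exercised on a short bit sequence. In this sketch the function names and the example bits are hypothetical; flipping every detected bit models a phase ambiguity of pi, and flipping a single bit models one detection error.

```python
def dpsk_encode(bits, b0=0):
    """Differential encoding b_n = a_n XOR b_{n-1}, starting from b_0."""
    encoded = [b0]
    for a in bits:
        encoded.append(a ^ encoded[-1])
    return encoded

def dpsk_decode(detected):
    """Differential decoding a_n = b_n XOR b_{n-1}."""
    return [detected[n] ^ detected[n - 1] for n in range(1, len(detected))]

a = [1, 0, 1, 1, 0, 0, 1]
b = dpsk_encode(a)
b_flipped = [bit ^ 1 for bit in b]   # every bit inverted: pi phase ambiguity
b_one_error = list(b)
b_one_error[3] ^= 1                  # a single detection error
print(dpsk_decode(b_flipped) == a)
print(dpsk_decode(b_one_error))
```

Inverting every detected bit leaves the decoded stream unchanged, whereas a single detection error corrupts the two decoded bits that depend on it.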
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Chapter 5 

Chapter 4: Communication over 
Band-limitted AWGN Channel 

5.1 Digital Transmission over Baseband Channels 1 

Until this point, we have considered data transmissions over simple additive Gaussian channels that are not
time or band limited. In this module we will consider channels that do have bandwidth constraints, and are
limited to a frequency range around zero (DC). The channel is best modeled by g (t), the impulse response
of the baseband channel.

Consider modulated signals $x_t = s_m(t)$ for $0 \le t \le T$ for some $m \in \{1, 2, \ldots, M\}$. The channel
output is then

$$r_t = \int_{-\infty}^{\infty} x_\tau\, g(t - \tau)\, d\tau + N_t
= \int_{-\infty}^{\infty} s_m(\tau)\, g(t - \tau)\, d\tau + N_t \quad (5.1)$$

The signal contribution in the frequency domain is

$$\forall f: \tilde{S}_m(f) = S_m(f)\, G(f) \quad (5.2)$$

The optimum matched filter should match to the filtered signal:

$$\forall f: H_m^{opt}(f) = \overline{S_m(f)\, G(f)}\, e^{-j 2\pi f T} \quad (5.3)$$

This filter is indeed optimum (i.e., it maximizes signal-to-noise ratio); however, it requires knowledge of
the channel impulse response. The signal energy is changed to

$$E_{\tilde{s}} = \int_{-\infty}^{\infty} \left|\tilde{S}_m(f)\right|^2 df \quad (5.4)$$



The band limited nature of the channel and the stream of time limited modulated signal create aliasing 
which is referred to as intersymbol interference. We will investigate ISI for a general PAM signaling. 

5.2 Introduction to ISI 2 

A typical baseband digital system is described in Figure 5.1(a). At the transmitter, the modulated pulses are
filtered to comply with some bandwidth constraint. These pulses are distorted by the reactances of the cable 



1 This content is available online at <http://cnx.org/content/m10056/2.12/>.
2 This content is available online at <http://cnx.org/content/m15519/1.5/>.
or by fading in the wireless systems. Figure 5.1(b) illustrates a convenient model, lumping all the filtering
into one overall equivalent system transfer function.

$$H(f) = H_t(f)\, H_c(f)\, H_r(f)$$
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Figure 5.1: Intersymbol interference in the detection process, (a) Typical baseband digital system, (b) 
Equivalent model 



Due to the effects of system filtering, the received pulses can overlap one another as shown in Figure
5.1(b). Such interference is termed InterSymbol Interference (ISI). Even in the absence of noise, the effects
of filtering and channel-induced distortion lead to ISI.

Nyquist investigated this problem and showed that the theoretical minimum system bandwidth needed in order
to detect $R_s$ symbols/s, without ISI, is $R_s/2$ or 1/2T hertz. For baseband systems, when H (f) is such a
filter with single-sided bandwidth 1/2T (the ideal Nyquist filter) as shown in Figure 5.2(a), its impulse
response is of the form h(t) = sinc(t/T), shown in Figure 5.2(b). This sinc(t/T)-shaped pulse is called the
ideal Nyquist pulse. Even though two successive pulses h (t) and h(t - T) have long tails, the figure shows
every tail of h (t) passing through zero amplitude at the instant when h(t - T) is to be sampled. Therefore,
assuming that the synchronization is perfect, there will be no ISI.
Figure 5.2: Nyquist channels for zero ISI. (a) Rectangular system transfer function H(f). (b) Received 
pulse shape h(t) = sinc(t/T) 
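
The zero-ISI property of the ideal Nyquist pulse can be checked numerically. The following is a minimal sketch, not part of the original module; the symbol duration `T` and the helper names are assumptions chosen for illustration.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def nyquist_pulse(t, T):
    """Ideal Nyquist pulse h(t) = sinc(t / T)."""
    return sinc(t / T)

T = 1e-3  # assumed symbol duration (1 ms), illustrative only

# Samples of h(t) at the sampling instants of the neighbouring symbols.
at_sampling_instants = [nyquist_pulse(k * T, T) for k in range(-3, 4)]

# Sample taken 10% of a symbol late: the tail of the pulse is no longer zero.
with_timing_error = nyquist_pulse(1.1 * T, T)
```

At the instants kT the neighbouring pulses contribute (numerically) nothing, whereas a small timing offset lets the long tails leak into the decision sample.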




The names "Nyquist filter" and "Nyquist pulse" are often used to describe the general class of filtering 
and pulse-shaping that satisfy zero ISI at the sampling points. Among the class of Nyquist filters, the most 
popular ones are the raised cosine and root-raised cosine. 

A fundamental parameter for a communication system is bandwidth efficiency, R/W bits/s/Hz. For ideal 
Nyquist filtering, the theoretical maximum symbol-rate packing without ISI is 2 symbols/s/Hz. For example, 
with 64-ary PAM, $M = 64 = 2^6$ amplitudes, the theoretical maximum bandwidth efficiency possible 
without ISI is 6 bits/symbol × 2 symbols/s/Hz = 12 bits/s/Hz. 

5.3 Pulse Amplitude Modulation Through Bandlimited Channel 3 



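Consider a PAM system with data sequence $b_{-10}, \ldots, b_{-1}, b_0, b_1, \ldots$ 
This implies 

$$x_t = \sum_{n=-\infty}^{\infty} a_n s(t - nT) \qquad (5.5)$$

The received signal is 

$$\begin{aligned}
r_t &= \int_{-\infty}^{\infty} \sum_{n=-\infty}^{\infty} a_n s(t - nT - \tau)\, g(\tau)\, d\tau + N_t \\
    &= \sum_{n=-\infty}^{\infty} a_n \int_{-\infty}^{\infty} s(t - nT - \tau)\, g(\tau)\, d\tau + N_t \\
    &= \sum_{n=-\infty}^{\infty} a_n \tilde{s}(t - nT) + N_t
\end{aligned} \qquad (5.6)$$

where $g(t)$ is the channel impulse response. 
Since the signals span a one-dimensional space, one filter matched to $\tilde{s}(t) = (s * g)(t)$ is sufficient. 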



3 This content is available online at <http://cnx.org/content/ml0094/2.7/>. 
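The matched filter's impulse response is 

$$h^{opt}(t) = (s * g)(T - t) \quad \text{for all } t \qquad (5.7)$$

The matched filter output is 

$$\begin{aligned}
y(t) &= \int_{-\infty}^{\infty} \sum_{n=-\infty}^{\infty} a_n \tilde{s}(t - nT - \tau)\, h^{opt}(\tau)\, d\tau + \nu(t) \\
     &= \sum_{n=-\infty}^{\infty} a_n \int_{-\infty}^{\infty} \tilde{s}(t - nT - \tau)\, h^{opt}(\tau)\, d\tau + \nu(t) \\
     &= \sum_{n=-\infty}^{\infty} a_n u(t - nT) + \nu(t)
\end{aligned} \qquad (5.8)$$

where $u(t) = (\tilde{s} * h^{opt})(t)$ is the overall pulse at the matched filter output and $\nu(t)$ is the filtered noise. 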

The decision on the $k$th symbol is obtained by sampling the MF output at $kT$: 

$$y(kT) = \sum_{n=-\infty}^{\infty} a_n u(kT - nT) + \nu(kT) \qquad (5.9)$$

The $k$th symbol is of interest: 

$$y(kT) = a_k u(0) + \sum_{n=-\infty,\ n \neq k}^{\infty} a_n u(kT - nT) + \nu(kT) \qquad (5.10)$$

where $n \neq k$. 
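
The split of the sampled output into a desired term and an ISI term can be illustrated with a short numerical sketch. The pulse samples and symbol values below are hypothetical, and the symbol sequence is taken to be finite for simplicity.

```python
def sample_with_isi(symbols, pulse, k, noise=0.0):
    """Return (desired, isi, total) for y(kT) = sum_n a_n u((k - n)T) + noise.

    `pulse` maps the integer offset m = k - n to u(mT); offsets that are
    not listed are treated as zero.
    """
    desired = symbols[k] * pulse.get(0, 0.0)
    isi = sum(a * pulse.get(k - n, 0.0)
              for n, a in enumerate(symbols) if n != k)
    return desired, isi, desired + isi + noise

# Hypothetical overall pulse u(mT): main sample plus two nonzero neighbours.
pulse = {-1: 0.1, 0: 1.0, 1: 0.25}
symbols = [+1, -1, +1, +1]          # hypothetical binary PAM amplitudes a_n

desired, isi, y = sample_with_isi(symbols, pulse, k=2)
```

Here the previous symbol contributes through u(T) and the following symbol through u(-T); if u(mT) were zero for every m other than 0, the ISI term would vanish.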

Since the channel is bandlimited, it provides memory for the transmission system. The effect of old 
symbols (possibly even future signals) lingers and affects the performance of the receiver. The effect of 
ISI can be eliminated or controlled by proper design of modulation signals or precoding filters at the 
transmitter, or by equalizers or sequence detectors at the receiver. 

5.4 Precoding and Bandlimited Signals 4 

5.4.1 Precoding 

The data symbols are manipulated such that 

$$y_k(kT) = a_k u(0) + \mathrm{ISI} + \nu(kT) \qquad (5.11)$$



5.4.2 Design of Bandlimited Modulation Signals 

Recall that modulation signals are 

$$x_t = \sum_{n=-\infty}^{\infty} a_n s(t - nT) \qquad (5.12)$$



We can design s(t) such that 

$$u(nT) = \begin{cases} \text{large} & \text{if } n = 0 \\ \text{zero or small} & \text{if } n \neq 0 \end{cases} \qquad (5.13)$$

where $y(kT) = a_k u(0) + \sum_{n=-\infty,\ n \neq k}^{\infty} a_n u(kT - nT) + \nu(kT)$ (ISI is the sum term, and once again, $n \neq k$). 
Also, $u(nT) = (s * g * h^{opt})(nT)$. The signal s(t) can be designed to have reduced ISI. 



4 This content is available online at <http://cnx.org/content/ml01 18/2. 6/>. 
5.4.3 Design Equalizers at the Receiver 

Linear equalizers or decision feedback equalizers reduce ISI in the statistic $y_t$. 

5.4.4 Maximum Likelihood Sequence Detection 

$$y(kT) = \sum_{n=-\infty}^{\infty} a_n u(kT - nT) + \nu(kT) \qquad (5.14)$$

By observing $y(T), y(2T), \ldots$, the data symbols are observed frequently. Therefore, ISI can be viewed as 
diversity to increase performance. 

5.5 Pulse Shaping to Reduce ISI 5 

The Raised-Cosine Filter 

One frequently used transfer function belonging to the Nyquist class (zero ISI at the sampling times) is called the raised-cosine 
filter. It can be expressed as 

$$H(f) = \begin{cases}
1 & |f| < 2W_0 - W \\
\cos^2\!\left(\dfrac{\pi}{4}\,\dfrac{|f| + W - 2W_0}{W - W_0}\right) & 2W_0 - W < |f| < W \\
0 & |f| > W
\end{cases} \qquad (1a)$$

$$h(t) = 2W_0\,\mathrm{sinc}(2W_0 t)\,\frac{\cos\left[2\pi (W - W_0) t\right]}{1 - \left[4 (W - W_0) t\right]^2} \qquad (1b)$$

where $W$ is the absolute bandwidth. $W_0 = 1/2T$ represents the minimum bandwidth for the rectangular 
spectrum and the -6 dB bandwidth (or half-amplitude point) for the raised-cosine spectrum. $W - W_0$ is 
termed the "excess bandwidth". 

The roll-off factor is defined to be $r = \dfrac{W - W_0}{W_0}$ (2), where $0 \le r \le 1$. 

With the Nyquist constraint $W_0 = R_s/2$, equation (2) can be rewritten as 

$$W = \frac{1}{2}(1 + r) R_s$$



5 This content is available online at <http://cnx.org/content/ml5520/!. 2/>. 
Figure 5.3: Raised-cosine filter characteristics. (a) System transfer function. (b) System impulse 
response 



The raised-cosine characteristic is illustrated in Figure 5.3 for r = 0, r = 0.5, and r = 1. When r = 1, the required 
excess bandwidth is 100%, and the system can provide a symbol rate of $R_s$ symbols/s using a bandwidth 
of $R_s$ hertz (twice the Nyquist minimum bandwidth), thus yielding a symbol-rate packing of 1 symbol/s/Hz. 

The larger the filter roll-off, the shorter will be the pulse tail. Small tails exhibit less sensitivity to timing 
errors and thus make for small degradation due to ISI. 

The smaller the filter roll-off, the smaller will be the excess bandwidth. The cost is longer pulse tails, 
larger pulse amplitudes, and thus greater sensitivity to timing errors. 
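
The trade-off between roll-off and tail size can be explored with a small sketch of the raised-cosine bandwidth and impulse response. The function names and the normalized symbol duration are assumptions made for this illustration; the removable singularity of the impulse response at t = T/(2r) is handled with its limit value.

```python
import math

def sinc(x):
    """Normalized sinc with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def raised_cosine_bandwidth(symbol_rate, rolloff):
    """Absolute bandwidth W = (1 + r) R_s / 2."""
    return 0.5 * (1.0 + rolloff) * symbol_rate

def raised_cosine_pulse(t, T, rolloff):
    """Impulse response h(t) of the raised-cosine filter with W0 = 1/(2T)."""
    w0 = 1.0 / (2.0 * T)
    x = rolloff * t / T                  # equals 2 (W - W0) t
    denom = 1.0 - (2.0 * x) ** 2         # 1 - [4 (W - W0) t]^2
    if abs(denom) < 1e-9:
        shaping = math.pi / 4.0          # limit of cos(pi x) / (1 - 4 x^2) at x = 1/2
    else:
        shaping = math.cos(math.pi * x) / denom
    return 2.0 * w0 * sinc(t / T) * shaping

T = 1.0                                   # normalized symbol duration (assumption)
tail_r0 = abs(raised_cosine_pulse(2.5 * T, T, 0.0))   # between sampling instants
tail_r1 = abs(raised_cosine_pulse(2.5 * T, T, 1.0))
```

For every roll-off the pulse is zero at the nonzero multiples of T, but midway between sampling instants the r = 1 tail is far smaller than the r = 0 tail, which is why a larger roll-off tolerates timing errors better at the cost of bandwidth.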
The Root Raised-Cosine Filter 

Recall that the raised-cosine frequency transfer function describes the composite H (f) including trans- 
mitting filter, channel filter and receiving filter. The filtering at the receiver is chosen so that the overall 
transfer function is a form of raised-cosine. Often this is accomplished by choosing both the receiving filter 
and the transmitting filter so that each has a transfer function known as a root raised cosine. Neglecting 
any channel-induced ISI, the product of these root-raised cosine functions yields the composite raised-cosine 
system transfer function. 



5.6 Two Types of Error-Performance Degradation 6 

Error-performance degradation can be classified into two groups. The first is due to a decrease in received 
signal power or an increase in noise or interference power, giving rise to a loss in signal-to-noise ratio $E_b/N_0$. 
The second is due to signal distortion, such as ISI. 



6 This content is available online at <http://cnx.org/content/ml5527/!. 2/>. 
Figure 5.4: Bit error probability 

Suppose that we need a communication system with a bit-error probability $P_b$ versus $E_b/N_0$ characteristic 
corresponding to the solid-line curve plotted in Figure 5.4. Suppose that after the system is configured, the 
performance does not follow the theoretical curve, but in fact follows the dashed-line plot (1): there is a loss in 
$E_b/N_0$ due to some signal losses or an increased level of noise or interference. This loss in $E_b/N_0$ is not so 
terrible when compared with the possible effects of degradation caused by a distortion mechanism, corresponding 
to the dashed-line plot (2). Instead of suffering a simple loss in signal-to-noise ratio, there is a degradation 
effect brought about by ISI. If there is no remedy for this distortion, no amount of $E_b/N_0$ will 
improve the error performance. More $E_b/N_0$ cannot help the ISI problem because an increase in $E_b/N_0$ does not 
change the overlap between pulses. 



5.7 Eye Pattern 7 

An eye pattern is the display that results from measuring a system's response to baseband signals in a 
prescribed way. 



7 This content is available online at <http://cnx.org/content/ml5521/!. 2/>. 
Figure 5.5: Eye pattern (the optimum sampling time is marked on the figure) 



Figure 5.5 shows the eye pattern that results for binary pulse signalling. The width of the 
opening indicates the time over which sampling for detection might be performed. The optimum sampling 
time corresponds to the maximum eye opening, yielding the greatest protection against noise. If there were 
no filtering in the system, then the pattern would look like a box rather than an eye. In Figure 5.5, $D_A$, the 
range of amplitude differences, is a measure of distortion caused by ISI. 

$J_T$, the range of time differences of the zero crossings, is a measure of the timing jitter. $M_N$ is a 
measure of noise margin. $S_T$ is a measure of sensitivity to timing error. 

In general, the most frequent use of the eye pattern is for qualitatively assessing the extent of the ISI. 
As the eye closes, ISI is increasing; as the eye opens, ISI is decreasing. 



5.8 Transversal Equalizer 8 

A training sequence used for equalization is often chosen to be a noise-like sequence which is needed to 
estimate the channel frequency response. 

In the simplest sense, a training sequence might be a single narrow pulse, but a pseudonoise (PN) signal 
is preferred in practice because the PN signal has larger average power and hence larger SNR for the same 
peak transmitted power. 



8 This content is available online at <http://cnx.org/content/ml5522/!. 4/>. 
Figure 5.6: Received pulse exhibiting distortion 

Consider that a single pulse was transmitted over a system designed to have a raised-cosine transfer 
function $H_{RC}(f) = H_t(f)\,H_r(f)$. Also consider that the channel induces ISI, so that the received 
demodulated pulse exhibits distortion, as shown in Figure 5.6, such that the pulse sidelobes do not go through zero 
at the sample times. To achieve the desired raised-cosine transfer function, the equalizing filter should have a 
frequency response 

$$H_e(f) = \frac{1}{H_c(f)} = \frac{1}{|H_c(f)|}\, e^{-j\theta_c(f)}$$

In other words, we would like the equalizing filter to generate a set of canceling echoes. The transversal 
filter, illustrated in Figure 5.7, is the most popular form of an easily adjustable equalizing filter consisting of a 
delay line with T-second taps (where T is the symbol duration). The tap weights could be chosen to force 
the system impulse response to zero at all but one of the sampling times, thus making $H_e(f)$ correspond 
exactly to the inverse of the channel transfer function $H_c(f)$. 

Figure 5.7: Transversal filter 



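Consider that there are 2N + 1 taps with weights $c_{-N}, c_{-N+1}, \ldots, c_N$. Output samples z(k) are the 
convolution of the input samples x(k) and the tap weights $c_n$ as follows: 

$$z(k) = \sum_{n=-N}^{N} x(k - n)\, c_n, \qquad k = -2N, \ldots, 2N \qquad (2)$$

By defining the vectors z and c and the matrix x as, respectively, 

$$z = \begin{bmatrix} z(-2N) \\ \vdots \\ z(0) \\ \vdots \\ z(2N) \end{bmatrix}, \qquad
c = \begin{bmatrix} c_{-N} \\ \vdots \\ c_0 \\ \vdots \\ c_N \end{bmatrix}, \qquad
x = \begin{bmatrix}
x(-N) & 0 & 0 & \cdots & 0 & 0 \\
x(-N+1) & x(-N) & 0 & \cdots & 0 & 0 \\
\vdots & & & & & \vdots \\
x(N) & x(N-1) & x(N-2) & \cdots & x(-N+1) & x(-N) \\
\vdots & & & & & \vdots \\
0 & 0 & 0 & \cdots & x(N) & x(N-1) \\
0 & 0 & 0 & \cdots & 0 & x(N)
\end{bmatrix}$$

we can describe the relationship among z(k), x(k) and $c_n$ more compactly as 

$$z = x\,c \qquad (3a)$$

Whenever the matrix x is square, we can find c by solving the following equation: 

$$c = x^{-1} z \qquad (3b)$$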
Notice that the index k was arbitrarily chosen to allow for 4N + 1 sample points. The vectors z and c 
have dimensions 4N + 1 and 2N + 1, respectively. Such equations are referred to as an overdetermined set. This problem 
can be solved in a deterministic way, known as the zero-forcing solution, or in a statistical way, known as the 
minimum mean-square error (MSE) solution. 

Zero-Forcing Solution 

First, by disposing of the top N rows and bottom N rows, matrix x is transformed into a square matrix of 
dimension 2N + 1 by 2N + 1. Then equation $c = x^{-1} z$ is used to solve the 2N + 1 simultaneous equations 
for the set of 2N + 1 weights $c_n$. This solution minimizes the peak ISI distortion by selecting the $c_n$ weights 
so that the equalizer output is forced to zero at N sample points on either side of the desired pulse: 

$$z(k) = \begin{cases} 1 & k = 0 \\ 0 & k = \pm 1, \pm 2, \ldots, \pm N \end{cases} \qquad (4)$$

For such an equalizer with finite length, the peak distortion is guaranteed to be minimized only if the 
eye pattern is initially open. However, for high-speed transmission and channels introducing much ISI, the 
eye is often closed before equalization. Since the zero-forcing equalizer neglects the effect of noise, it is not 
always the best system solution. 
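
The zero-forcing procedure can be sketched for a three-tap equalizer (N = 1). The received pulse samples below are hypothetical illustration values, and the small linear solver is written out only to keep the example self-contained.

```python
def solve_linear(a, b):
    """Solve a c = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def zero_forcing_taps(x, N):
    """Tap weights c_{-N..N}; `x` maps k to x(k), missing samples are zero."""
    # Square matrix: the rows k = -N..N of the full convolution matrix.
    matrix = [[x.get(k - n, 0.0) for n in range(-N, N + 1)]
              for k in range(-N, N + 1)]
    target = [1.0 if k == 0 else 0.0 for k in range(-N, N + 1)]
    return solve_linear(matrix, target)

def equalizer_output(x, taps, N, k):
    """z(k) = sum_n x(k - n) c_n."""
    return sum(x.get(k - n, 0.0) * c for n, c in zip(range(-N, N + 1), taps))

# Hypothetical distorted pulse samples x(-2), ..., x(2).
x = {-2: 0.0, -1: 0.2, 0: 0.9, 1: -0.3, 2: 0.1}
taps = zero_forcing_taps(x, N=1)
z = {k: equalizer_output(x, taps, 1, k) for k in range(-2, 3)}
```

With these samples the output is forced to 1 at k = 0 and to 0 at k = ±1, while a small residual ISI remains at k = ±2, outside the span the three taps can control.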
Minimum MSE Solution 

A more robust equalizer is obtained if the $c_n$ tap weights are chosen to minimize the mean-square error 
(MSE) of all the ISI terms plus the noise power at the output of the equalizer. MSE is defined as the expected 
value of the squared difference between the desired data symbol and the estimated data symbol. 

By multiplying both sides of equation (3a) by $x^T$, we have 

$$x^T z = x^T x\, c \qquad (5)$$

and 

$$R_{xz} = R_{xx}\, c \qquad (6)$$

where $R_{xz} = x^T z$ is called the cross-correlation vector and $R_{xx} = x^T x$ is called the autocorrelation matrix of 
the input noisy signal. In practice, $R_{xz}$ and $R_{xx}$ are unknown, but they can be approximated by transmitting 
a test signal and using time-average estimates to solve for the tap weights from equation (6) as follows: 

$$c = R_{xx}^{-1} R_{xz}$$

Most high-speed telephone-line modems use an MSE weight criterion because it is superior to a zero-forcing 
criterion; it is more robust in the presence of noise and large ISI. 
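
A minimal sketch of the minimum-MSE solution follows, assuming a noiseless test pulse with hypothetical sample values, so that the time-average estimates reduce to the matrix products above. It builds the full (4N + 1)-row matrix, forms the autocorrelation matrix and cross-correlation vector, and solves equation (6).

```python
def solve_linear(a, b):
    """Solve a c = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def mmse_taps(x, N):
    """Least-squares tap weights from the overdetermined system z = x c."""
    rows = range(-2 * N, 2 * N + 1)
    X = [[x.get(k - n, 0.0) for n in range(-N, N + 1)] for k in rows]
    z = [1.0 if k == 0 else 0.0 for k in rows]            # desired output
    cols = range(2 * N + 1)
    r_xx = [[sum(row[i] * row[j] for row in X) for j in cols] for i in cols]
    r_xz = [sum(row[i] * zk for row, zk in zip(X, z)) for i in cols]
    return solve_linear(r_xx, r_xz), X, z

# Hypothetical distorted pulse samples x(-2), ..., x(2).
x = {-2: 0.0, -1: 0.2, 0: 0.9, 1: -0.3, 2: 0.1}
taps, X, z = mmse_taps(x, N=1)

# Residual error at all 4N + 1 output samples.
error = [zk - sum(a * c for a, c in zip(row, taps)) for row, zk in zip(X, z)]
```

Unlike the zero-forcing taps, these weights do not force any single output sample exactly to zero; they spread the residual error over all 4N + 1 samples so that its total squared value is as small as possible.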

5.9 Decision Feedback Equalizer 9 

The basic limitation of a linear equalizer, such as the transversal filter, is its poor performance on channels 
having spectral nulls. A decision feedback equalizer (DFE) is a nonlinear equalizer that uses previous 
detector decisions to eliminate the ISI on pulses that are currently being demodulated. In other words, the 
distortion on a current pulse that was caused by previous pulses is subtracted. 



9 This content is available online at <http://cnx.org/content/ml5524/!. 4/>. 
Figure 5.8: Decision feedback equalizer 

Figure 5.8 shows a simplified block diagram of a DFE where the forward filter and the feedback filter can 
each be a linear filter, such as a transversal filter. The nonlinearity of the DFE stems from the nonlinear 
characteristic of the detector that provides an input to the feedback filter. The basic idea of a DFE is that 
if the values of the symbols previously detected are known, then ISI contributed by these symbols can be 
canceled out exactly at the output of the forward filter by subtracting past symbol values with appropriate 
weighting. The forward and feedback tap weights can be adjusted simultaneously to fulfill a criterion such 
as minimizing the MSE. 

The advantage of a DFE implementation is that the feedback filter, which additionally works to remove 
ISI, operates on noiseless quantized levels, and thus its output is free of channel noise. 



5.10 Adaptive Equalization 10 

Another type of equalization, capable of tracking a slowly time-varying channel response, is known as 
adaptive equalization. It can be implemented to perform tap-weight adjustments periodically or continually. 
Periodic adjustments are accomplished by periodically transmitting a preamble or short training sequence of 
digital data known by the receiver. Continual adjustments are accomplished by replacing the known training 
sequence with a sequence of data symbols estimated from the equalizer output and treated as known data. 
When performed continually and automatically in this way, the adaptive procedure is referred to as decision 
directed. 

If the probability of error exceeds one percent, the decision-directed equalizer might not converge. A 
common solution to this problem is to initialize the equalizer with an alternate process, such as a preamble 
to provide good channel-error performance, and then switch to decision-directed mode. 

10 This content is available online at <http://cnx.org/content/m15523/1.2/>. 

The simultaneous equations described in equation (3) of module "Transversal Equalizer" 11 do not include 
the effects of channel noise. To obtain a stable solution for the filter weights, it is necessary that the data be 
averaged to obtain stable signal statistics, or the noisy solutions obtained from the noisy data must be 
averaged. The most robust algorithm that averages noisy solutions is the least-mean-square (LMS) algorithm. 
Each iteration of this algorithm uses a noisy estimate of the error gradient to adjust the weights in the 
direction that reduces the average mean-square error. 

The noisy gradient is simply the product $e(k)\, r_x$ of an error scalar $e(k)$ and the data vector $r_x$. 

$$e(k) = z(k) - \hat{z}(k) \qquad (1)$$

where $z(k)$ and $\hat{z}(k)$ are the desired output signal (a sample free of ISI) and the estimate at time k. 

$$\hat{z}(k) = c^T r_x = \sum_{n=-N}^{N} x(k - n)\, c_n \qquad (2)$$

where $c^T$ is the transpose of the weight vector at time k. 

The iterative process that updates the set of weights is obtained as follows: 

$$c(k + 1) = c(k) + \Delta\, e(k)\, r_x \qquad (3)$$

where $c(k)$ is the vector of filter weights at time k, and $\Delta$ is a small term that limits the coefficient step 
size and thus controls the rate of convergence of the algorithm as well as the variance of the steady-state 
solution. Stability is assured if the parameter $\Delta$ is smaller than the reciprocal of the energy of the data in 
the filter. Thus, while we want the convergence parameter $\Delta$ to be large for fast convergence but not so 
large as to be unstable, we also want it to be small enough for low variance. 
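
The LMS update can be sketched with a known training sequence. The channel coefficients, step size, sequence length and helper names below are all assumptions chosen for illustration; the step size is kept well below the reciprocal of the data energy in the three-tap filter.

```python
import random

def lms_train(received, desired, num_taps, step):
    """One pass of LMS over a training sequence; returns the tap weights."""
    half = num_taps // 2
    padded = [0.0] * half + list(received) + [0.0] * half
    taps = [0.0] * num_taps
    for k, target in enumerate(desired):
        # Data vector r_x = [x(k+N), ..., x(k), ..., x(k-N)].
        window = padded[k:k + num_taps][::-1]
        estimate = sum(c * x for c, x in zip(taps, window))   # z_hat(k)
        error = target - estimate                             # e(k)
        taps = [c + step * error * x for c, x in zip(taps, window)]
    return taps

def equalize(received, taps):
    """Apply fixed tap weights to the whole received sequence."""
    half = len(taps) // 2
    padded = [0.0] * half + list(received) + [0.0] * half
    return [sum(c * x for c, x in zip(taps, padded[k:k + len(taps)][::-1]))
            for k in range(len(received))]

def mean_squared_error(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

rng = random.Random(0)
symbols = [rng.choice((-1.0, 1.0)) for _ in range(3000)]      # training data
# Hypothetical channel with ISI: x(k) = 0.2 a(k+1) + 0.9 a(k) - 0.3 a(k-1).
ext = [0.0] + symbols + [0.0]
received = [0.2 * ext[k + 2] + 0.9 * ext[k + 1] - 0.3 * ext[k]
            for k in range(len(symbols))]

taps = lms_train(received, symbols, num_taps=3, step=0.02)
mse_before = mean_squared_error(received, symbols)
mse_after = mean_squared_error(equalize(received, taps), symbols)
```

In decision-directed mode the `desired` values would be replaced by the detector's own decisions once the taps have converged on the training sequence.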



1 http://cnx.org/content/ml5522/latest/ 
Chapter 6 

Chapter 5: Channel Coding 



6.1 Channel Capacity 1 

In the previous section, we discussed information sources and quantified information. We also discussed how 
to represent (and compress) information sources in binary symbols in an efficient manner. In this section, 
we consider channels and will find out how much information can be sent through the channel reliably. 

We will first consider simple channels where the input is a discrete random variable and the output is 
also a discrete random variable. These discrete channels could represent analog channels with modulation 
and demodulation and detection. 



Figure 6.1: Discrete channel (discrete symbols -> Modulation -> Channel -> Demodulation -> Detection -> symbols) 



Let us denote the input sequence to the channel as 

$$X = \begin{pmatrix} X_1 \\ \vdots \\ X_n \end{pmatrix} \qquad (6.1)$$

where $X_i \in \mathcal{X}$, a discrete symbol set or input alphabet. 

The channel output 

$$Y = \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix} \qquad (6.2)$$

where $Y_i \in \mathcal{Y}$, a discrete symbol set or output alphabet. 

The statistical properties of a channel are determined if one finds $p_{Y|X}(y|x)$ for all $y \in \mathcal{Y}^n$ and for all 
$x \in \mathcal{X}^n$. A discrete channel is called a discrete memoryless channel if 

$$p_{Y|X}(y|x) = \prod_{i=1}^{n} p_{Y_i|X_i}(y_i|x_i) \qquad (6.3)$$

for all $y \in \mathcal{Y}^n$ and for all $x \in \mathcal{X}^n$. 

1 This content is available online at <http://cnx.org/content/m10173/2.8/>. 

Example 6.1 

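A binary symmetric channel (BSC) is a discrete memoryless channel with binary input and binary 
output and $p_{Y|X}(y = 0|x = 1) = p_{Y|X}(y = 1|x = 0)$. As an example, a white Gaussian channel 
with antipodal signaling and matched filter receiver has probability of error of $Q\left(\sqrt{2E_s/N_0}\right)$. Since 
the error is symmetric with respect to the transmitted bit, then 

$$p_{Y|X}(0|1) = p_{Y|X}(1|0) = Q\left(\sqrt{\frac{2E_s}{N_0}}\right) = \epsilon \qquad (6.4)$$

Figure 6.2 (BSC transition diagram: each input bit is received correctly with probability $1 - \epsilon$ and flipped with probability $\epsilon$) 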



It is interesting to note that every time a BSC is used one bit is sent across the channel with probability 
of error of e. The question is how much information or how many bits can be sent per channel use, reli- 
ably. Before we consider the above question a few definitions are essential. These are discussed in mutual 
information (Section 6.2). 
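
As a numerical illustration of how a Gaussian channel with antipodal signaling maps to a BSC, the crossover probability can be computed from the signal-to-noise ratio. This is a sketch under the assumption that the error probability is Q(sqrt(2 Es/N0)); the function names and the chosen SNR values are illustrative only.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bsc_crossover(es_over_n0_db):
    """Crossover probability of the equivalent BSC for a given Es/N0 in dB."""
    ratio = 10.0 ** (es_over_n0_db / 10.0)
    return q_function(math.sqrt(2.0 * ratio))

# Crossover probabilities for a few hypothetical operating points.
table = {snr_db: bsc_crossover(snr_db) for snr_db in (0.0, 3.0, 6.0, 9.6)}
```

The crossover probability falls quickly as Es/N0 grows, which is what turns a noisy analog channel into a nearly clean binary one.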
6.2 Mutual Information 2 

Recall that 

$$H(X, Y) = -\sum_{x} \sum_{y} p_{X,Y}(x, y) \log p_{X,Y}(x, y) \qquad (6.5)$$

$$H(Y) + H(X|Y) = H(X) + H(Y|X) \qquad (6.6)$$

Definition 6.1: Mutual Information 

The mutual information between two discrete random variables is denoted by $I(X;Y)$ and defined 
as 

$$I(X;Y) = H(X) - H(X|Y) \qquad (6.7)$$

Mutual information is a useful concept to measure the amount of information shared between input 
and output of noisy channels. 

In our previous discussions it became clear that when the channel is noisy there may not be reliable 
communications. Therefore, the limiting factor could very well be reliability when one considers noisy 
channels. Claude E. Shannon in 1948 changed this paradigm and stated a theorem that presents the rate 
(speed of communication) as the limiting factor as opposed to reliability. 

Example 6.2 

Consider a discrete memoryless channel with four possible inputs and outputs. 




Figure 6.3 



Every time the channel is used, one of the four symbols will be transmitted. Therefore, 2 bits are 
sent per channel use. The system, however, is very unreliable. For example, if "a" is received, the 
receiver can not determine, reliably, if "a" was transmitted or "d". However, if the transmitter and 
receiver agree to only use symbols "a" and "c" and never use "b" and "d", then the transmission 
will always be reliable, but 1 bit is sent per channel use. Therefore, the rate of transmission was 
the limiting factor and not reliability. 

This is the essence of Shannon's noisy channel coding theorem, i.e., using only those inputs whose corre- 
sponding outputs are disjoint (e.g., far apart). The concept is appealing, but does not seem possible with 
binary channels since the input is either zero or one. It may work if one considers a vector of binary inputs 
referred to as the extension channel. 



2 This content is available online at <http://cnx.org/content/ml0178/2.9/>. 
Figure 6.4 (extension channel: input vector X, output vector Y) 



This module provides a description of the basic information necessary to understand Shannon's Noisy 
Channel Coding Theorem (Section 6.4). However, for additional information on typical sequences, please 
refer to Typical Sequences (Section 6.3). 



6.3 Typical Sequences 3 

If the binary symmetric channel has crossover probability $\epsilon$ and x is transmitted, then by the Law of Large 
Numbers the output y differs from x in about $n\epsilon$ places if n is very large. 

$$d_H(x, y) \approx n\epsilon \qquad (6.8)$$

The number of sequences of length n that differ from x in $n\epsilon$ places is 

$$\binom{n}{n\epsilon} = \frac{n!}{(n\epsilon)!\,(n - n\epsilon)!} \qquad (6.9)$$

Example 6.3 

$x = (0, 0, 1)^T$, $\epsilon = \frac{1}{3}$, and $n\epsilon = 3 \times \frac{1}{3} = 1$. The number of output sequences different from x by one 
element: $\binom{3}{1} = \frac{3!}{1!\,2!} = 3$, given by $(1, 0, 1)^T$, $(0, 1, 1)^T$, and $(0, 0, 0)^T$. 
3 This content is available online at <http://cnx.org/content/m10179/2.10/>. 

Using Stirling's approximation 

$$n! \approx n^n e^{-n} \sqrt{2\pi n} \qquad (6.10)$$

we can approximate 

$$\binom{n}{n\epsilon} \approx 2^{n\left(-\epsilon \log_2 \epsilon - (1 - \epsilon)\log_2(1 - \epsilon)\right)} = 2^{n H_b(\epsilon)} \qquad (6.11)$$

where $H_b(\epsilon) = -\epsilon \log_2 \epsilon - (1 - \epsilon)\log_2(1 - \epsilon)$ is the entropy of a binary memoryless source. For any x 
there are $2^{nH_b(\epsilon)}$ highly probable outputs that correspond to this input. 
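
The quality of this approximation can be checked numerically by comparing the exact binomial coefficient with the exponent $nH_b(\epsilon)$. The sketch below uses illustrative block lengths and helper names that are not part of the original module.

```python
import math

def binary_entropy(p):
    """H_b(p) in bits, with H_b(0) = H_b(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def binomial_exponent(n, eps):
    """(1/n) log2 of the number of sequences differing in about n*eps places."""
    return math.log2(math.comb(n, round(n * eps))) / n

eps = 0.1
gaps = {n: binary_entropy(eps) - binomial_exponent(n, eps)
        for n in (100, 1000, 10000)}
```

The gap between the exact exponent and $H_b(\epsilon)$ shrinks as n grows, which is the sense in which the number of likely outputs behaves like $2^{nH_b(\epsilon)}$.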

Consider the output vector Y as a very long random vector with entropy $nH(Y)$. As discussed earlier 
(Example 3.1), the number of typical (or highly probable) sequences is roughly $2^{nH(Y)}$. Therefore, $2^n$ is the 
total number of binary sequences, $2^{nH(Y)}$ is the number of typical sequences, and $2^{nH_b(\epsilon)}$ is the number of 
elements in a group of possible outputs for one input vector. The maximum number of input sequences that 
produce nonoverlapping output sequences is 

$$M = \frac{2^{nH(Y)}}{2^{nH_b(\epsilon)}} = 2^{n\left(H(Y) - H_b(\epsilon)\right)} \qquad (6.12)$$

Figure 6.5 (typical sequences as the result of input X, and an untypical sequence) 



The number of distinguishable input sequences of length n is 

$$2^{n\left(H(Y) - H_b(\epsilon)\right)} \qquad (6.13)$$

The number of information bits that can be sent across the channel reliably per n channel uses is 
$n\left(H(Y) - H_b(\epsilon)\right)$. The maximum reliable transmission rate per channel use is 

$$R = \frac{\log_2 M}{n} = \frac{n\left(H(Y) - H_b(\epsilon)\right)}{n} = H(Y) - H_b(\epsilon) \qquad (6.14)$$

The maximum rate can be increased by increasing $H(Y)$. Note that $H_b(\epsilon)$ is only a function of the crossover 
probability and cannot be minimized any further. 

The entropy of the channel output is the entropy of a binary random variable. If the input is chosen to 
be uniformly distributed with $p_X(0) = p_X(1) = \frac{1}{2}$, then 

$$p_Y(0) = (1 - \epsilon)\, p_X(0) + \epsilon\, p_X(1) \qquad (6.15)$$

and 

$$p_Y(1) = (1 - \epsilon)\, p_X(1) + \epsilon\, p_X(0) \qquad (6.16)$$

Then $H(Y)$ takes its maximum value of 1, resulting in a maximum rate $R = 1 - H_b(\epsilon)$ when $p_X(0) = 
p_X(1) = \frac{1}{2}$. This result says that ordinarily one bit is transmitted across a BSC with reliability $1 - \epsilon$. If 
one needs the probability of error to reach zero, then one should reduce transmission of information to 
$1 - H_b(\epsilon)$ and add redundancy. 

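Recall that for Binary Symmetric Channels (BSC) 

$$\begin{aligned}
H(Y|X) &= p_X(0)\, H(Y|X = 0) + p_X(1)\, H(Y|X = 1) \\
       &= p_X(0)\left(-(1 - \epsilon)\log_2(1 - \epsilon) - \epsilon \log_2 \epsilon\right) + p_X(1)\left(-(1 - \epsilon)\log_2(1 - \epsilon) - \epsilon \log_2 \epsilon\right) \\
       &= -(1 - \epsilon)\log_2(1 - \epsilon) - \epsilon \log_2 \epsilon \\
       &= H_b(\epsilon)
\end{aligned} \qquad (6.17)$$

Therefore, the maximum rate indeed was 

$$R = H(Y) - H(Y|X) = I(X;Y) \qquad (6.18)$$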

Example 6.4 

The maximum reliable rate for a BSC is $1 - H_b(\epsilon)$. The rate is 1 when $\epsilon = 0$ or $\epsilon = 1$. The rate 
is 0 when $\epsilon = \frac{1}{2}$. 




Figure 6.6 
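
The behaviour of the rate $1 - H_b(\epsilon)$ described in Example 6.4 can be reproduced with a few lines of code. The function names below are illustrative only.

```python
import math

def binary_entropy(p):
    """H_b(p) in bits, with H_b(0) = H_b(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def bsc_max_rate(eps):
    """Maximum reliable rate of a BSC, in bits per channel use."""
    return 1.0 - binary_entropy(eps)

rates = {eps: bsc_max_rate(eps) for eps in (0.0, 0.1, 0.5, 0.9, 1.0)}
```

The rate is symmetric about $\epsilon = \frac{1}{2}$: a channel that flips almost every bit is as useful as one that flips almost none, because the receiver can simply invert its decisions.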



This module provides background information necessary for an understanding of Shannon's Noisy Chan- 
nel Coding Theorem (Section 6.4). It is also closely related to material presented in Mutual Information 
(Section 6.2). 
6.4 Shannon's Noisy Channel Coding Theorem 4 

It is highly recommended that the information presented in Mutual Information (Section 6.2) and in Typical 
Sequences (Section 6.3) be reviewed before proceeding with this document. An introductory module on the 
theorem is available at Noisy Channel Theorems 5 . 

Theorem 6.1: Shannon's Noisy Channel Coding 

The capacity of a discrete-memoryless channel is given by 

$$C = \max_{p_X(x)} I(X;Y) \qquad (6.19)$$

where $I(X;Y)$ is the mutual information between the channel input X and the output Y. If the 
transmission rate R is less than C, then for any $\epsilon > 0$ there exists a code with block length n large 
enough whose error probability is less than $\epsilon$. If R > C, the error probability of any code with any 
block length is bounded away from zero. 

Example 6.5 

If we have a binary symmetric channel with crossover probability 0.1, then the capacity is $C = 1 - H_b(0.1) \approx 0.53$ 
bits per transmission. Therefore, it is possible to send 0.4 bits per channel use through the channel 
reliably. This means that we can take 400 information bits and map them into a code of length 
1000 bits. Then the whole codeword can be transmitted over the channel. About one hundred of those bits 
may be detected incorrectly, but the 400 information bits may be decoded correctly. 

Before we consider continuous-time additive white Gaussian channels, let's concentrate on discrete-time 
Gaussian channels 

$$Y_i = X_i + \eta_i \qquad (6.20)$$

where the $X_i$'s are information-bearing random variables and $\eta_i$ is a Gaussian random variable with variance 
$\sigma_\eta^2$. The inputs $X_i$ are constrained to have power less than P: 

$$\frac{1}{n}\sum_{i=1}^{n} X_i^2 \le P \qquad (6.21)$$

Consider an output block of size n 

$$Y = X + \eta \qquad (6.22)$$

For large n, by the Law of Large Numbers, 

$$\frac{1}{n}\sum_{i=1}^{n} \eta_i^2 = \frac{1}{n}\sum_{i=1}^{n} |y_i - x_i|^2 \le \sigma_\eta^2 \qquad (6.23)$$

This indicates that with large probability as n approaches infinity, Y will be located in an n-dimensional 
sphere of radius $\sqrt{n\sigma_\eta^2}$ centered about X, since $|y - x|^2 \le n\sigma_\eta^2$. 

On the other hand, since the $X_i$'s are power constrained and the $\eta_i$'s and $X_i$'s are independent, 

$$\frac{1}{n}\sum_{i=1}^{n} y_i^2 \le P + \sigma_\eta^2 \qquad (6.24)$$

$$|Y|^2 \le n\left(P + \sigma_\eta^2\right) \qquad (6.25)$$

This means Y is in a sphere of radius $\sqrt{n\left(P + \sigma_\eta^2\right)}$ centered around the origin. 

4 This content is available online at <http://cnx.Org/content/ml0180/2.10/> . 
5 "Noisy Channel Coding Theorem" <http://cnx.org/content/m0073/latest/> 
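How many X's can we transmit to have nonoverlapping Y spheres in the output domain? The question 
is how many spheres of radius $\sqrt{n\sigma_\eta^2}$ fit in a sphere of radius $\sqrt{n\left(P + \sigma_\eta^2\right)}$. 

$$M = \frac{\left(\sqrt{n\left(P + \sigma_\eta^2\right)}\right)^n}{\left(\sqrt{n\sigma_\eta^2}\right)^n} = \left(1 + \frac{P}{\sigma_\eta^2}\right)^{n/2} \qquad (6.26)$$

Figure 6.7 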



Exercise 6.1 (Solution on p. 99.) 

How many bits of information can one send in n uses of the channel? 

The capacity of a discrete-time Gaussian channel is $C = \frac{1}{2}\log_2\left(1 + \frac{P}{\sigma_\eta^2}\right)$ bits per channel use. 

When the channel is a continuous-time, bandlimited, additive white Gaussian channel with noise power spectral 
density $\frac{N_0}{2}$, input power constraint P, and bandwidth W, the system can be sampled at the Nyquist 
rate to provide power per sample P and noise power 

$$\sigma_\eta^2 = \int_{-W}^{W} \frac{N_0}{2}\, df = W N_0 \qquad (6.27)$$

The channel capacity is $\frac{1}{2}\log_2\left(1 + \frac{P}{N_0 W}\right)$ bits per transmission. Since the sampling rate is 2W, then 

$$C = \frac{2W}{2}\log_2\left(1 + \frac{P}{N_0 W}\right) \ \text{bits/trans.} \times \text{trans./sec} \qquad (6.28)$$

$$C = W \log_2\left(1 + \frac{P}{N_0 W}\right) \ \text{bits/sec} \qquad (6.29)$$

Example 6.6 

The capacity of the voice band of a telephone channel can be determined using the Gaussian model. 
The bandwidth is 3000 Hz and the signal-to-noise ratio is often 30 dB. Therefore, 

$$C = 3000 \log_2(1 + 1000) \approx 30000 \ \text{bits/sec} \qquad (6.30)$$

One should not expect to design modems faster than 30 kbits/s using this model of telephone channels. 
It is also interesting to note that since the signal-to-noise ratio is large, we are expecting to transmit 
10 bits/second/Hertz across telephone channels. 
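
The numbers in Example 6.6 can be reproduced with a short calculation. The function names below are illustrative, and the signal-to-noise ratio is taken to be $P/(N_0 W)$ expressed in dB.

```python
import math

def awgn_capacity_bps(bandwidth_hz, snr_db):
    """C = W log2(1 + P / (N0 W)) in bits per second."""
    snr = 10.0 ** (snr_db / 10.0)
    return bandwidth_hz * math.log2(1.0 + snr)

def discrete_gaussian_capacity(power, noise_variance):
    """C = (1/2) log2(1 + P / sigma^2) in bits per channel use."""
    return 0.5 * math.log2(1.0 + power / noise_variance)

capacity = awgn_capacity_bps(3000.0, 30.0)       # telephone voice band
efficiency = capacity / 3000.0                    # bits/second/Hertz
per_sample = discrete_gaussian_capacity(1000.0, 1.0)
```

The per-sample capacity is half the bandwidth efficiency because the bandlimited channel supplies 2W samples per second.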
6.5 Channel Coding 6 

Channel coding is a viable method to reduce information rate through the channel and increase reliability. 
This goal is achieved by adding redundancy to the information symbol vector resulting in a longer coded 
vector of symbols that are distinguishable at the output of the channel. Another brief explanation of channel 
coding is offered in Channel Coding and the Repetition Code 7 . We consider only two classes of codes, block 
codes (Section 6.5.1: Block codes) and convolutional codes (Section 6.6). 

6.5.1 Block codes 

The information sequence is divided into blocks of length k. Each block is mapped into channel inputs of 
length n. The mapping is independent from previous blocks, that is, there is no memory from one block to 
another. 

Example 6.7 

k = 2 and n = 5 

00 -> 00000 (6.31) 

01 -> 10100 (6.32) 

10 -> 01111 (6.33) 

11 -> 11011 (6.34) 

information sequence -> codeword (channel input) 

A binary block code is completely defined by $2^k$ binary sequences of length n called codewords. 

$$C = \{c_1, c_2, \ldots, c_{2^k}\} \qquad (6.35)$$

$$c_i \in \{0, 1\}^n \qquad (6.36)$$

There are three key questions, 

1. How can one find "good" codewords? 

2. How can one systematically map information sequences into codewords? 

3. How can one systematically find the corresponding information sequences from a codeword, i.e., how 
can we decode? 

These can be done if we concentrate on linear codes and utilize finite field algebra. 

A block code is linear if $c_i \in C$ and $c_j \in C$ implies $c_i \oplus c_j \in C$, where $\oplus$ is an elementwise modulo 2 
addition. 

Hamming distance is a useful measure of codeword properties 

$$d_H(c_i, c_j) = \text{number of places in which they differ} \qquad (6.37)$$



6 This content is available online at <http://cnx.Org/content/ml0174/2.l l/>. 
7n Repetition Codes" <http://cnx.org/content/m0071/latest/> 
Denote the codeword for information sequence $e_1 = (1, 0, 0, \ldots, 0)^T$ by $g_1$, for $e_2 = (0, 1, 0, \ldots, 0)^T$ 
by $g_2$, ..., and for $e_k = (0, 0, \ldots, 0, 1)^T$ by $g_k$. Then any information sequence can be expressed as 

$$u = \begin{pmatrix} u_1 \\ \vdots \\ u_k \end{pmatrix} = \sum_{i=1}^{k} u_i e_i \qquad (6.38)$$

and the corresponding codeword could be 

$$c = \sum_{i=1}^{k} u_i g_i \qquad (6.39)$$

Therefore 

$$c = uG \qquad (6.40)$$

with $c \in \{0, 1\}^n$ and $u \in \{0, 1\}^k$, where $G = \begin{pmatrix} g_1 \\ g_2 \\ \vdots \\ g_k \end{pmatrix}$ is a $k \times n$ matrix and all operations are modulo 2. 

Example 6.8 

In Example 6.7 with 

00 -> 00000 (6.41) 

01 -> 10100 (6.42) 

10 -> 01111 (6.43) 

11 -> 11011 (6.44) 

$g_1 = (0, 1, 1, 1, 1)^T$ and $g_2 = (1, 0, 1, 0, 0)^T$ and $G = \begin{pmatrix} 0 & 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 & 0 \end{pmatrix}$ 


Additional information about coding efficiency and error is provided in Block Channel Coding 8 . 

Examples of good linear codes include Hamming codes, BCH codes, Reed-Solomon codes, and many 
more. The rate of these codes is defined as $\frac{k}{n}$, and these codes have different error correction and error 
detection properties. 

6.6 Convolutional Codes 9 

Convolutional codes are one type of code used for channel coding (Section 6.5). Another type of code used 
is block coding (Section 6.5.1: Block codes). 

6.6.1 Convolutional codes 

In convolutional codes, each block of k bits is mapped into a block of n bits but these n bits are not only 
determined by the present k information bits but also by the previous information bits. This dependence 
can be captured by a finite state machine. 

Example 6.9 

A rate 1/2 convolutional coder ($k = 1$, $n = 2$) with memory length 2 and constraint length 3.

Figure 6.8 (encoder diagram not reproduced)

Since the length of the shift register is 2, there are 4 different states. The behavior of the
convolutional coder can be captured by a 4-state machine with states 00, 01, 10, 11.
For example, the arrival of an information bit can move the coder from state 10 to state 01.
The encoding and the decoding process can be realized in a trellis structure.



8 "Block Channel Coding" <http://cnx.org/content/m0094/latest/> 

9 This content is available online at <http://cnx.org/content/m10181/2.7/>.

Figure 6.9

If the input sequence is

1 1 0

the output sequence would be

11 10 10 11

The transmitted codeword is then 11 10 10 11. Suppose there is one error on the channel, so that the
received sequence is 11 00 10 11.

Figure 6.10

Starting from state 00, the Hamming distance between the possible paths and the received
sequence is measured. At the end, the path with minimum distance to the received sequence is
chosen as the correct trellis path. The information sequence will then be determined.
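The minimum-distance path search described above can be sketched in a few lines. The encoder connections
of Figure 6.8 are not recoverable from the text, so the tap vectors below are an assumption: they were
chosen because they reproduce the example output 11 10 10 11 when the input 1 1 0 is followed by one
extra 0. All function names are illustrative.

```python
# Assumed tap vectors acting on (current bit, previous bit, bit before that).
GENERATORS = ((1, 0, 1), (1, 1, 1))

def branch_output(bit, state):
    """Output pair produced when `bit` enters the coder in `state`."""
    window = (bit,) + state
    return tuple(sum(g * w for g, w in zip(gen, window)) % 2 for gen in GENERATORS)

def encode(bits):
    """Run the 4-state machine from state 00 and collect the output pairs."""
    state = (0, 0)
    out = []
    for b in bits:
        out.append(branch_output(b, state))
        state = (b, state[0])
    return out

def viterbi(received):
    """Hard-decision search for the trellis path closest in Hamming distance."""
    survivors = {(0, 0): (0, [])}  # state -> (accumulated distance, input bits)
    for r in received:
        nxt = {}
        for state, (metric, path) in survivors.items():
            for b in (0, 1):
                out = branch_output(b, state)
                m = metric + sum(x != y for x, y in zip(out, r))
                new_state = (b, state[0])
                # Keep only the best path entering each state.
                if new_state not in nxt or m < nxt[new_state][0]:
                    nxt[new_state] = (m, path + [b])
        survivors = nxt
    return min(survivors.values(), key=lambda item: item[0])

sent = encode([1, 1, 0, 0])                       # 11 10 10 11
received = [(1, 1), (0, 0), (1, 0), (1, 1)]       # one channel error
distance, decoded = viterbi(received)
print(distance, decoded)                          # 1 [1, 1, 0, 0]
```

Under these assumptions the surviving path has distance 1 from the received sequence and returns the
original input bits, so the single channel error is corrected.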

Convolutional coding lends itself to very efficient trellis-based encoding and decoding. Convolutional
codes are very practical and powerful.

Solutions to Exercises in Chapter 6

Solution to Exercise 6.1 (p. 94) 



lo 8'2 S i + 7T ) ( 6 - 45 ) 



(Jn 
Chapter 7

Chapter 6: Communication over Fading Channels

7.1 Fading Channel 1 

For most channels, where signals propagate in the atmosphere and near the ground, the free-space propagation
model is inadequate to describe the channel behavior and predict system performance. In a wireless system,
a signal can travel from transmitter to receiver over multiple reflective paths. This phenomenon, called
multipath propagation, can cause fluctuations in the received signal's amplitude, phase, and angle of arrival,
giving rise to the terminology multipath fading. Another name, scintillation, is used to describe the fading
caused by physical changes in the propagating medium, such as variations in the electron density of the
ionospheric layers that reflect high-frequency radio signals. Both fading and scintillation refer to a signal's
random fluctuations.

7.2 Characterizing Mobile-Radio Propagation 2

1 This content is available online at <http://cnx.org/content/m15525/1.2/>.
2 This content is available online at <http://cnx.org/content/m15528/1.2/>.

Figure 7.1: Fading channel manifestations



Figure 7.1 gives an overview of fading-channel manifestations. Large-scale fading represents the average power
attenuation or the path loss due to motion over large areas. This phenomenon is affected by prominent
terrain contours (e.g., hills, forests, billboards, clumps of buildings) between the transmitter and receiver.
Small-scale fading refers to the dramatic changes in signal amplitude and phase as a result of small changes
(as small as half a wavelength) in the spatial positioning between a receiver and transmitter. Small-scale fading
is called Rayleigh fading if there are multiple reflective paths and no line-of-sight signal component; otherwise
it is called Rician fading. When a mobile radio roams over a large area it must process signals that experience both
types of fading: small-scale fading superimposed on large-scale fading. Large-scale fading (attenuation or
path loss) can be considered as a spatial average over the small-scale fluctuations of the signal.

There are three basic mechanisms that impact signal propagation in a mobile communication system: 



1. Reflection occurs when a propagating electromagnetic wave impinges upon a smooth surface with very
large dimensions relative to the RF signal wavelength.

2. Diffraction occurs when the propagation path between the transmitter and receiver is obstructed by a
dense body with dimensions that are large relative to the RF signal wavelength. Diffraction accounts
for RF energy traveling from transmitter to receiver without a line-of-sight path. It is often termed
shadowing because the diffracted field can reach the receiver even when shadowed by an impenetrable
obstruction.

3. Scattering occurs when a radio wave impinges on either a large, rough surface or any surface whose
dimensions are on the order of the RF signal wavelength or less, causing the energy to be spread out or
reflected in all directions.

Figure 7.2: Link budget considerations for a fading channel



Figure 7.2 is a convenient pictorial showing the various contributions that must be considered when estimating
path loss for link budget analysis in a mobile radio application: (1) mean path loss as a function of distance,
due to large-scale fading, (2) near-worst-case variations about the mean path loss or large-scale fading margin
(typically 6-10 dB), (3) near-worst-case Rayleigh or small-scale fading margin (typically 20-30 dB).
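As a worked illustration of how the three contributions combine, the sketch below adds them in dB. The
120 dB mean path loss is a hypothetical value chosen only for the example; the two margins are taken at
the upper end of the typical ranges quoted above.

```python
def total_loss_budget_db(mean_path_loss_db, large_scale_margin_db, small_scale_margin_db):
    """Loss the link must tolerate: mean path loss plus both fading margins, in dB."""
    return mean_path_loss_db + large_scale_margin_db + small_scale_margin_db

# Hypothetical mean path loss of 120 dB, with 10 dB and 30 dB fading margins.
budget_db = total_loss_budget_db(120.0, 10.0, 30.0)
print(budget_db)  # 160.0
```

The point of the arithmetic is that the fading margins, not the mean path loss alone, determine how much
loss the link has to be designed for.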

Using complex notation,

$s(t) = \mathrm{Re}\{g(t) e^{j 2\pi f_c t}\}$ (1)

where Re{.} denotes the real part of {.}, and $f_c$ is the carrier frequency. The baseband waveform $g(t)$
is called the complex envelope of $s(t)$ and can be expressed as

$g(t) = |g(t)| e^{j\phi(t)} = R(t) e^{j\phi(t)}$ (2)

where $R(t) = |g(t)|$ is the envelope magnitude, and $\phi(t)$ is its phase.

In a fading environment, $g(t)$ will be modified by a complex dimensionless multiplicative factor
$\alpha(t) e^{-j\theta(t)}$. The modified baseband waveform can be written as $\alpha(t) e^{-j\theta(t)} g(t)$. The
magnitude of this envelope can be expressed as follows:

$\alpha(t) R(t) = m(t) r_0(t) R(t)$ (3)

where $m(t)$ and $r_0(t)$ are called the large-scale-fading component and the small-scale-fading component
of the envelope, respectively.

Sometimes, $m(t)$ is referred to as the local mean or log-normal fading, and $r_0(t)$ is referred to as multipath
or Rayleigh fading.

For the case of mobile radio, Figure 7.3 illustrates the relationship between $\alpha(t)$ and $m(t)$. In Figure 7.3(a), the
signal power received is a function of the multiplicative factor $\alpha(t)$. Small-scale fading superimposed on
large-scale fading can be readily identified. The typical antenna displacement between adjacent signal-strength
nulls due to small-scale fading is approximately half a wavelength. In Figure 7.3(b), the large-scale fading or
local mean $m(t)$ has been removed in order to view the small-scale fading $r_0(t)$. The log-normal fading is a
relatively slowly varying function of position, while the Rayleigh fading is a relatively fast varying function of
position.

Figure 7.3: Large-scale fading and small-scale fading



7.3 Large-Scale Fading 3 

In general, propagation models for both indoor and outdoor radio channels indicate that the mean path loss
behaves as follows:

$\bar{L}_p(d) \propto (d/d_0)^n$ (1)

$\bar{L}_p(d)$ (dB) $= L_s(d_0)$ (dB) $+ 10 n \log_{10}(d/d_0)$ (2)

3 This content is available online at <http://cnx.org/content/m15526/1.2/>.

where $d$ is the distance between transmitter and receiver, and the reference distance $d_0$ corresponds to
a point located in the far field of the transmit antenna. Typically, $d_0$ is taken to be 1 km for large cells, 100 m for
microcells, and 1 m for indoor channels. Moreover, $L_s(d_0)$ is evaluated using the free-space equation

$L_s(d_0) = \left( \frac{4\pi d_0}{\lambda} \right)^2$ (3)

or by conducting measurements. The value of the path-loss exponent $n$ depends on the frequency, antenna
height, and propagation environment. In free space, $n$ is equal to 2. In the presence of a very strong guided
wave phenomenon (as in urban streets), $n$ can be lower than 2. When obstructions are present, $n$ is larger.

Measurements have shown that the path loss $L_p$ is a random variable having a log-normal distribution
about the mean distance-dependent value $\bar{L}_p(d)$:

$L_p(d)$ (dB) $= L_s(d_0)$ (dB) $+ 10 n \log_{10}(d/d_0) + X_\sigma$ (dB) (4)

where $X_\sigma$ denotes a zero-mean Gaussian random variable (in dB) with standard deviation $\sigma$ (in
dB). $X_\sigma$ is site and distance dependent.

As can be seen from the equation, the parameters needed to statistically describe path loss due to
large-scale fading, for an arbitrary location with a specific transmitter-receiver separation, are (1) the reference
distance, (2) the path-loss exponent, and (3) the standard deviation $\sigma$ of $X_\sigma$.
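The path-loss model can be evaluated numerically once the three parameters are fixed. The following
sketch assumes that the loss at the reference distance is the free-space value (4πd0/λ)^2 and draws the
log-normal term from a Gaussian in dB. The carrier wavelength, exponent, and standard deviation used in
the example are hypothetical values, not taken from the text.

```python
import math
import random

def free_space_loss_db(d0_m, wavelength_m):
    """Free-space loss (4*pi*d0/lambda)^2 at the reference distance, in dB."""
    return 20.0 * math.log10(4.0 * math.pi * d0_m / wavelength_m)

def mean_path_loss_db(d_m, d0_m, n, wavelength_m):
    """Mean path loss: reference loss plus 10*n*log10(d/d0)."""
    return free_space_loss_db(d0_m, wavelength_m) + 10.0 * n * math.log10(d_m / d0_m)

def path_loss_sample_db(d_m, d0_m, n, wavelength_m, sigma_db, rng):
    """One realization of the path loss, including the log-normal term X_sigma."""
    return mean_path_loss_db(d_m, d0_m, n, wavelength_m) + rng.gauss(0.0, sigma_db)

# Hypothetical microcell: 900 MHz carrier, d0 = 100 m, n = 3, sigma = 8 dB.
wavelength = 3.0e8 / 900.0e6
rng = random.Random(1)
print(mean_path_loss_db(1000.0, 100.0, 3.0, wavelength))
print(path_loss_sample_db(1000.0, 100.0, 3.0, wavelength, 8.0, rng))
```

With an exponent of 3, every tenfold increase in distance adds 30 dB to the mean loss, and individual
locations scatter around that mean according to the chosen standard deviation.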

7.4 Small-Scale Fading 4 

Small-scale fading refers to the dramatic changes in signal amplitude and phase that can be experienced
as a result of small changes (as small as half a wavelength) in the spatial position between transmitter and
receiver.

In this section, we will develop the small-scale fading component $r_0(t)$. Analysis proceeds on the
assumption that the antenna remains within a limited trajectory so that the effect of large-scale fading $m(t)$
is constant. Assume that the antenna is traveling and there are multiple scatter paths, each associated with
a time-variant propagation delay $\tau_n(t)$ and a time-variant multiplicative factor $\alpha_n(t)$. Neglecting noise, the
received bandpass signal can be written as

$r(t) = \sum_n \alpha_n(t) s(t - \tau_n(t))$ (1)

Substituting Equation (1) of the module Characterizing Mobile-Radio Propagation into Equation (1), we
can write the received bandpass signal as follows:

$r(t) = \mathrm{Re}\left\{ \sum_n \alpha_n(t) g(t - \tau_n(t)) e^{j 2\pi f_c (t - \tau_n(t))} \right\}$ (2)

$= \mathrm{Re}\left\{ \left[ \sum_n \alpha_n(t) e^{-j 2\pi f_c \tau_n(t)} g(t - \tau_n(t)) \right] e^{j 2\pi f_c t} \right\}$

The equivalent received baseband signal, denoted $s(t)$ in the remainder of this section, is therefore

$s(t) = \sum_n \alpha_n(t) e^{-j 2\pi f_c \tau_n(t)} g(t - \tau_n(t))$ (3)

Consider the transmission of an unmodulated carrier at frequency $f_c$; in other words, $g(t) = 1$ for all time.
The received baseband signal then becomes

$s(t) = \sum_n \alpha_n(t) e^{-j 2\pi f_c \tau_n(t)} = \sum_n \alpha_n(t) e^{-j\theta_n(t)}$ (4)

where $\theta_n(t) = 2\pi f_c \tau_n(t)$. The baseband signal $s(t)$ consists of a sum of time-variant components
having amplitudes $\alpha_n(t)$ and phases $\theta_n(t)$. Notice that $\theta_n(t)$ will change by $2\pi$ radians whenever
$\tau_n(t)$ changes by $1/f_c$ (a very small delay). These multipath components combine either constructively or
destructively, resulting in amplitude variations of $s(t)$. The final equation is very important because it tells us
that, even though a bandpass signal is the signal that experienced the fading effects and gave rise to the
received signal $r(t)$, these effects can be described by analyzing $r(t)$ at the baseband level.
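The constructive and destructive combining of multipath components can be seen by evaluating the sum of
phasors for an unmodulated carrier at one instant. This is a minimal sketch: the 1 GHz carrier and the
two equal-amplitude paths are hypothetical, and each phase is taken as 2π f_c τ_n.

```python
import cmath
import math

def baseband_sum(alphas, delays_s, fc_hz):
    """Sum of alpha_n * exp(-j*2*pi*fc*tau_n) for an unmodulated carrier."""
    return sum(a * cmath.exp(-2j * math.pi * fc_hz * tau)
               for a, tau in zip(alphas, delays_s))

fc = 1.0e9  # hypothetical carrier frequency in Hz

# Second path delayed by a full carrier period: the phasors add in phase.
in_phase = abs(baseband_sum([1.0, 1.0], [0.0, 1.0 / fc], fc))

# Second path delayed by half a carrier period: the phasors cancel.
out_of_phase = abs(baseband_sum([1.0, 1.0], [0.0, 0.5 / fc], fc))

print(round(in_phase, 6), round(out_of_phase, 6))  # 2.0 0.0
```

A change in delay of only half a nanosecond moves this two-path example from full reinforcement to a
deep null, which is why small antenna displacements produce large envelope fluctuations.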



4 This content is available online at <http://cnx.org/content/m15531/1.1/>.

When the received signal is made up of multiple reflective rays plus a significant line-of-sight
(non-faded) component, the received envelope amplitude has a Rician pdf as below, and the fading is referred to
as Rician fading:

$p(r_0) = \frac{r_0}{\sigma^2} \exp\left[ -\frac{r_0^2 + A^2}{2\sigma^2} \right] I_0\left( \frac{r_0 A}{\sigma^2} \right)$ for $r_0 \ge 0$, $A \ge 0$, and $p(r_0) = 0$ otherwise (5)

The parameter $\sigma^2$ is the pre-detection mean power of the multipath signal. $A$ denotes the peak magnitude
of the non-faded signal component and $I_0(\cdot)$ is the modified Bessel function of the first kind and zero order.
The Rician distribution is often described in terms of a parameter $K$, which is defined as the ratio of the power
in the specular component to the power in the multipath signal. It is given by $K = A^2 / (2\sigma^2)$.

When the magnitude of the specular component $A$ approaches zero, the Rician pdf approaches a Rayleigh
pdf, shown as

$p(r_0) = \frac{r_0}{\sigma^2} \exp\left[ -\frac{r_0^2}{2\sigma^2} \right]$ for $r_0 \ge 0$, and $p(r_0) = 0$ otherwise (6)

The Rayleigh pdf results from having no specular signal component; it represents the pdf associated with
the worst case of fading per mean received signal power.
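The two envelope densities can be compared numerically. The sketch below assumes the standard forms of
the Rician and Rayleigh pdfs with multipath power parameter sigma and specular amplitude A, and
evaluates the modified Bessel function by its power series so that only the standard library is needed.
The function names are illustrative.

```python
import math

def bessel_i0(x, terms=60):
    """Modified Bessel function I0 via the series sum of (x^2/4)^k / (k!)^2."""
    q = (x / 2.0) ** 2
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= q / ((k + 1) ** 2)
    return total

def rician_pdf(r0, A, sigma):
    """Rician envelope pdf; zero for negative r0."""
    if r0 < 0:
        return 0.0
    s2 = sigma ** 2
    return (r0 / s2) * math.exp(-(r0 ** 2 + A ** 2) / (2.0 * s2)) * bessel_i0(r0 * A / s2)

def rayleigh_pdf(r0, sigma):
    """Rayleigh envelope pdf; zero for negative r0."""
    if r0 < 0:
        return 0.0
    s2 = sigma ** 2
    return (r0 / s2) * math.exp(-(r0 ** 2) / (2.0 * s2))

def k_factor(A, sigma):
    """Ratio of specular power to multipath power, K = A^2 / (2 sigma^2)."""
    return A ** 2 / (2.0 * sigma ** 2)

# With A = 0 the Rician pdf reduces to the Rayleigh pdf.
print(rician_pdf(1.0, 0.0, 1.0), rayleigh_pdf(1.0, 1.0))
print(k_factor(2.0, 1.0))  # 2.0
```

Setting A to zero makes the Bessel factor equal to one, so the two functions return the same value, which
mirrors the limiting argument in the text.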

Small-scale fading manifests itself in two mechanisms: time spreading of the signal (or signal dispersion) and
time-variant behavior of the channel (figure 2). It is important to distinguish between two different time
references: delay time $\tau$ and transmission time $t$. Delay time refers to the time-spreading effect resulting
from the fading channel's non-optimum impulse response. The transmission time, however, is related to the
motion of the antenna or spatial changes, accounting for propagation path changes that are perceived as the
channel's time-variant behavior.

Figure 7.5



7.5 Signal Time-Spreading 5 

Signal Time-Spreading Viewed in the Time-Delay Domain

A simple way to model the fading phenomenon is the notion of wide-sense stationary uncorrelated
scattering (WSSUS). The model treats signals arriving at a receive antenna with different delays as uncorrelated.

In Figure 1(a), a multipath-intensity profile $S(\tau)$ is plotted. $S(\tau)$ helps us understand how the average
received power varies as a function of time delay $\tau$. The term "time delay" is used to refer to the excess delay.
It represents the signal's propagation delay that exceeds the delay of the first signal arrival at the receiver. In
a wireless channel, the received signal usually consists of several discrete multipath components, which
together make up $S(\tau)$. For a single transmitted impulse, the time $T_m$ between the first and last received
component represents the maximum excess delay.

5 This content is available online at <http://cnx.org/content/m15533/1.3/>.

Degradation Categories due to Signal Time-Spreading Viewed in the Time-Delay Domain

In a fading channel, the relationship between maximum excess delay time $T_m$ and symbol time $T_s$ can be
viewed in terms of two different degradation categories: frequency-selective fading and frequency-nonselective
or flat fading.

A channel is said to exhibit frequency-selective fading if $T_m > T_s$. This condition occurs whenever
the received multipath components of a symbol extend beyond the symbol's time duration. In fact, another
name for this category of fading degradation is channel-induced ISI. In this case of frequency-selective fading,
mitigating the distortion is possible because many of the multipath components are resolvable by the receiver.

A channel is said to exhibit frequency-nonselective or flat fading if $T_m < T_s$. In this case, all of the
received multipath components of a symbol arrive within the symbol time duration; hence, the components
are not resolvable. There is no channel-induced ISI distortion because the signal time spreading does not
result in significant overlap among neighboring received symbols.
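The time-domain test can be written as a one-line comparison. In this sketch the 5 microsecond maximum
excess delay and the two symbol rates are hypothetical, and the boundary case where the two times are
equal, which the text does not define, is arbitrarily reported as flat.

```python
def time_domain_category(max_excess_delay_s, symbol_time_s):
    """Classify the degradation by comparing T_m with T_s."""
    if max_excess_delay_s > symbol_time_s:
        return "frequency-selective"
    return "flat"

T_m = 5.0e-6  # hypothetical maximum excess delay in seconds

# 1 Msymbol/s gives T_s = 1 microsecond, shorter than T_m.
print(time_domain_category(T_m, 1.0 / 1.0e6))   # frequency-selective

# 10 ksymbol/s gives T_s = 100 microseconds, longer than T_m.
print(time_domain_category(T_m, 1.0 / 1.0e4))   # flat
```

The same channel is therefore flat for a slow signal and frequency-selective for a fast one, since the
category depends on the symbol time as well as on the channel.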

Signal Time-Spreading Viewed in the Frequency Domain 

A completely analogous characterization of signal dispersion can be specified in the frequency domain.
In Figure 1(b), the spaced-frequency correlation function $|R(\Delta f)|$ can be seen; it is the Fourier transform of
$S(\tau)$. The correlation function $|R(\Delta f)|$ represents the correlation between the responses of the channel to two
signals as a function of the frequency difference between the two signals. The function $|R(\Delta f)|$ helps answer
the question of how correlated two received signals are when they are spaced in frequency by
$\Delta f = f_1 - f_2$. $|R(\Delta f)|$ can be measured by transmitting a pair of sinusoids separated in frequency by
$\Delta f$, cross-correlating the complex spectra of the two separately received signals, and repeating the process
many times with ever-larger separation $\Delta f$. The coherence bandwidth $f_0$ is a statistical measure of the range
of frequencies over which the channel passes all spectral components with approximately equal gain and linear
phase. Spectral components in that range are affected by the channel in a similar manner. Note that the
coherence bandwidth $f_0$ and the maximum excess delay time $T_m$ are approximately related by

$f_0 \approx \frac{1}{T_m}$ (1)

A more useful parameter is the delay spread, most often characterized in terms of its root-mean-square
(rms) value, which can be calculated as

$\sigma_\tau = \left( \overline{\tau^2} - (\bar{\tau})^2 \right)^{1/2}$ (2)

where $\bar{\tau}$ is the mean excess delay, $(\bar{\tau})^2$ is the mean squared, $\overline{\tau^2}$ is the second moment, and
$\sigma_\tau$ is the square root of the second central moment of $S(\tau)$.

An exact relationship between coherence bandwidth and delay spread does not exist. However, using Fourier
transform techniques, an approximation can be derived from actual signal dispersion measurements in various
channels. Several approximate relationships have been developed.

If the coherence bandwidth is defined as the frequency interval over which the channel's complex frequency
transfer function has a correlation of at least 0.9, the coherence bandwidth is approximately

$f_0 \approx \frac{1}{50 \sigma_\tau}$ (3)

With the dense-scatterer channel model, coherence bandwidth is defined as the frequency interval over
which the channel's complex frequency transfer function has a correlation of at least 0.5, to be

$f_0 \approx \frac{1}{5 \sigma_\tau}$ (4)

Studies involving ionospheric effects often employ the following definition:

$f_0 \approx \frac{1}{2\pi \sigma_\tau}$ (5)

The delay spread and coherence bandwidth are related to a channel's multipath characteristics, differing
for different propagation paths. It is important to note that all parameters in the last equation are independent of
signaling speed; a system's signaling speed only influences its transmission bandwidth $W$.
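The rms delay spread and the resulting coherence-bandwidth estimates can be computed from a discrete
power-delay profile. The sketch below assumes the commonly quoted approximations 1/(50 σ), 1/(5 σ), and
1/(2π σ) for the three definitions; the two-path profile is a hypothetical example and the function names
are illustrative.

```python
import math

def rms_delay_spread(delays_s, powers):
    """Square root of the second central moment of a discrete power-delay profile."""
    total = sum(powers)
    mean = sum(p * t for p, t in zip(powers, delays_s)) / total
    second_moment = sum(p * t * t for p, t in zip(powers, delays_s)) / total
    return math.sqrt(second_moment - mean ** 2)

def coherence_bandwidths_hz(sigma_tau_s):
    """Approximate coherence bandwidths for three common definitions."""
    return {
        "correlation >= 0.9": 1.0 / (50.0 * sigma_tau_s),
        "correlation >= 0.5": 1.0 / (5.0 * sigma_tau_s),
        "ionospheric": 1.0 / (2.0 * math.pi * sigma_tau_s),
    }

# Hypothetical profile: two equal-power paths separated by 2 microseconds.
sigma_tau = rms_delay_spread([0.0, 2.0e-6], [1.0, 1.0])
print(sigma_tau)  # about 1e-6 seconds
for name, f0 in coherence_bandwidths_hz(sigma_tau).items():
    print(name, round(f0))
```

For this profile the delay spread is about 1 microsecond, so the three estimates are roughly 20 kHz,
200 kHz, and 159 kHz. The spread between them shows how strongly the coherence bandwidth depends on the
correlation threshold used to define it.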

Degradation Categories due to Signal Time- Spreading Viewed in the Frequency Domain 

A channel is referred to as frequency-selective if $f_0 < 1/T_s \approx W$ (the symbol rate is taken to be equal to
the signaling rate or signal bandwidth $W$). Frequency-selective fading distortion occurs whenever a signal's
spectral components are not all affected equally by the channel. Some of the signal's spectral components
falling outside the coherence bandwidth will be affected differently, compared with those components contained
within the coherence bandwidth (Figure 2(a)).

Frequency-nonselective or flat-fading degradation occurs whenever $f_0 > W$. Hence, all of the signal's spectral
components will be affected by the channel in a similar manner (fading or non-fading) (Figure 2(b)). Flat
fading does not introduce channel-induced ISI distortion, but performance degradation can still be expected
due to the loss in SNR whenever the signal is fading. In order to avoid channel-induced ISI distortion, the
channel is required to exhibit flat fading. This occurs provided that

$f_0 > W \approx \frac{1}{T_s}$ (6)

Hence, the channel coherence bandwidth $f_0$ sets an upper limit on the transmission rate that can be used
without incorporating an equalizer in the receiver.

However, as a mobile radio changes its position, there will be times when the received signal experiences
frequency-selective distortion even though $f_0 > W$ (Figure 2(c)). When this occurs, the baseband pulse
can be especially mutilated by deprivation of its low-frequency components. Thus, even though a channel is
categorized as flat-fading, it can still manifest frequency-selective fading on occasion.

Examples of Flat Fading and Frequency-Selective Fading

The signal dispersion manifestation of the fading channel is analogous to the signal spreading that
characterizes an electronic filter. Figure 3(a) depicts a wideband filter (narrow impulse response) and its effect on a
signal in both the time domain and the frequency domain. This filter resembles a flat-fading channel yielding an
output that is relatively free of dispersion. Figure 3(b) shows a narrowband filter (wide impulse response).
The output signal suffers much distortion, as shown in both the time domain and the frequency domain. Here the
process resembles a frequency-selective channel.

7.6 Mitigating the Degradation Effects of Fading 6

Figure 1 highlights three major performance categories in terms of bit-error probability $P_B$ versus $E_b/N_0$.

6 This content is available online at <http://cnx.org/content/m15535/1.1/>.

The leftmost exponentially shaped curve highlights the performance that can be expected when using
any nominal modulation scheme in AWGN interference. Observe that at a reasonable $E_b/N_0$ level, good
performance can be expected.

The middle curve, referred to as the Rayleigh limit, shows the performance degradation resulting
from a loss in $E_b/N_0$ that is characteristic of flat fading or slow fading when there is no line-of-sight signal
component present. The curve is a function of the reciprocal of $E_b/N_0$, so for practical values of $E_b/N_0$,
performance will generally be "bad."

The curve that reaches an irreducible error-rate level, sometimes called an error floor, represents "awful" 
performance, where the bit-error probability can level off at values nearly equal to 0.5. This shows the severe 
performance degrading effects that are possible with frequency-selective fading or fast fading. 

If the channel introduces signal distortion as a result of fading, the system performance can exhibit an 
irreducible error rate at a level higher than the desired error rate. In such cases, the only approach available 
for improving performance is to use some forms of mitigation to remove or reduce the signal distortion. 

The mitigation method depends on whether the distortion is caused by frequency-selective fading or fast
fading. Once the signal distortion has been mitigated, the $P_B$ versus $E_b/N_0$ performance can transition from
the "awful" category to the merely "bad" Rayleigh-limit curve.

Next, it is possible to further ameliorate the effects of fading and strive to approach AWGN system 
performance by using some form of diversity to provide the receiver with a collection of uncorrelated replicas 
of the signal, and by using a powerful error-correction code. 

Figure 2 lists several mitigation techniques for combating the effects of both signal distortion and loss 
in SNR. The mitigation approaches to be used when designing a system should be considered in two basic 
steps: 

1) choose the type of mitigation to reduce or remove any distortion degradation; 

2) choose a diversity type that can best approach AWGN system performance.

7.7 Mitigation to Combat Frequency-Selective Distortion 7

Equalization can mitigate the effects of channel-induced ISI brought on by frequency-selective fading. It can 
help modify system performance described by the curve that is "awful" to the one that is merely "bad." The 
process of equalizing for mitigating ISI effects involves using methods to gather the dispersed symbol energy 
back into its original time interval. 

An equalizer is an inverse filter of the channel. If the channel is frequency selective, the equalizer enhances 
the frequency components with small amplitudes and attenuates those with large amplitudes. The goal is 
for the combination of channel and equalizer filter to provide a flat composite-received frequency response 
and linear phase. 

Because the channel response varies with time, the equalizer filters must be adaptive equalizers. 

The decision feedback equalizer (DFE) involves: 

1) a feedforward section that is a linear transversal filter whose stage length and tap weights are selected 
to coherently combine virtually all of the current symbol's energy. 

2) a feedback section that removes energy remaining from previously detected symbols. 

The basic idea behind the DFE is that once an information symbol has been detected, the ISI that it 
induces on future symbols can be estimated and subtracted before the detection of subsequent symbols. 
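The feedback idea can be illustrated with a deliberately simplified sketch. It assumes noiseless binary
(+1/-1) symbols, a hypothetical three-tap channel whose trailing taps are known exactly at the receiver,
and no feedforward section, so only the subtraction of ISI from past decisions is shown.

```python
def channel(symbols, taps):
    """Convolve the symbol stream with the channel taps (causal, zero initial state)."""
    return [sum(h * symbols[k - i] for i, h in enumerate(taps) if k - i >= 0)
            for k in range(len(symbols))]

def slicer(value):
    """Binary decision device."""
    return 1 if value >= 0 else -1

def dfe_detect(received, feedback_taps):
    """Subtract the ISI predicted from earlier decisions, then decide."""
    decisions = []
    for k, y in enumerate(received):
        isi = sum(b * decisions[k - 1 - i]
                  for i, b in enumerate(feedback_taps) if k - 1 - i >= 0)
        decisions.append(slicer(y - isi))
    return decisions

taps = [1.0, 0.6, 0.6]  # hypothetical channel: main tap plus two trailing echoes
symbols = [1, 1, -1, 1, -1, -1, 1, 1, 1, -1]
received = channel(symbols, taps)

plain = [slicer(y) for y in received]          # decisions without equalization
equalized = dfe_detect(received, taps[1:])     # decisions with decision feedback

print(sum(p != s for p, s in zip(plain, symbols)))      # 3
print(sum(e != s for e, s in zip(equalized, symbols)))  # 0
```

In this example the two echoes together exceed the main tap, so a plain slicer makes three errors, while
removing the ISI of the two previous decisions recovers every symbol. With noise or imperfect tap
estimates a wrong decision would feed back and could cause further errors, which this sketch does not model.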



7 This content is available online at <http://cnx.org/content/m15537/1.1/>.

A maximum-likelihood sequence estimation (MLSE) equalizer tests all possible data sequences
and chooses the data sequence that is the most probable of all the candidates. The MLSE is optimal in the
sense that it minimizes the probability of a sequence error. Since the MLSE equalizer is implemented by
using the Viterbi decoding algorithm, it is often referred to as the Viterbi equalizer.

Direct-sequence spread-spectrum (DS/SS) techniques can be used to mitigate frequency-selective 
ISI distortion because the hallmark of spread-spectrum systems is their capability of rejecting interference, 
and ISI is a type of interference. 

Consider a DS/SS binary phase-shift keying (PSK) communication channel comprising one direct path
and one reflected path. Assume that the propagation from transmitter to receiver results in a multipath
wave that is delayed by $\tau$ compared to the direct wave. The received signal, $r(t)$, neglecting noise, can be
expressed as follows:

$r(t) = A x(t) g(t) \cos(2\pi f_c t) + \alpha A x(t - \tau) g(t - \tau) \cos(2\pi f_c t + \theta)$

where $x(t)$ is the data signal, $g(t)$ is the pseudonoise (PN) spreading code, and $\tau$ is the differential
time delay between the two paths. The angle $\theta$ is a random phase, assumed to be uniformly distributed in
the range $(0, 2\pi)$, and $\alpha$ is the attenuation of the multipath signal relative to the direct path signal.

The receiver multiplies the incoming $r(t)$ by the code $g(t)$. If the receiver is synchronized to the direct
path signal, multiplication by the code signal yields the following:

$r(t) g(t) = A x(t) g^2(t) \cos(2\pi f_c t) + \alpha A x(t - \tau) g(t) g(t - \tau) \cos(2\pi f_c t + \theta)$

where $g^2(t) = 1$. If $\tau$ is greater than the chip duration, then

$\left| \int g(t) g(t - \tau) \, dt \right| \ll \left| \int g^2(t) \, dt \right|$

over some appropriate interval of integration (correlation). Thus, the spread spectrum system effectively 
eliminates the multipath interference by virtue of its code-correlation receiver. Even though channel-induced 
ISI is typically transparent to DS/SS systems, such systems suffer from the loss in energy contained in the 
multipath components rejected by the receiver. The need to gather this lost energy belonging to a received 
chip was the motivation for developing the Rake receiver. 
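The code-correlation argument can be checked on a short PN code. The sketch below generates a 31-chip
maximal-length sequence with a five-stage shift register (the recurrence corresponds to the primitive
polynomial x^5 + x^3 + 1) and compares the correlation at zero delay with the correlation at delays of
one or more chips. Treating the delayed code as a circular shift of one code period is a simplifying
assumption, and the function names are illustrative.

```python
def m_sequence(taps, nbits):
    """One period of a maximal-length sequence as +1/-1 chips (Fibonacci LFSR)."""
    state = [1] * nbits            # state[j-1] holds the bit generated j steps ago
    chips = []
    for _ in range(2 ** nbits - 1):
        chips.append(1 - 2 * state[-1])
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return chips

def periodic_correlation(code, shift):
    """Correlation of the code with a copy delayed by `shift` chips (circular)."""
    n = len(code)
    return sum(code[i] * code[(i - shift) % n] for i in range(n))

code = m_sequence(taps=(2, 5), nbits=5)

print(periodic_correlation(code, 0))                                # 31
print({periodic_correlation(code, s) for s in range(1, len(code))}) # {-1}
```

A multipath component delayed by at least one chip is therefore attenuated by a factor of 31 relative to
the synchronized path in this example, and longer codes give proportionally stronger rejection.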

A channel that is classified as flat fading can occasionally exhibit frequency-selective distortion when the
null of the channel's frequency-transfer function occurs at the center of the signal band. The use of DS/SS
is a practical way of mitigating such distortion because the wideband SS signal can span many lobes of
the selectively faded channel frequency response. This requires the spread-spectrum bandwidth $W_{ss}$ (or the
chip rate $R_{ch}$) to be greater than the coherence bandwidth $f_0$. The larger the ratio of $W_{ss}$ to $f_0$, the more
effective the mitigation will be.

Frequency-hopping spread-spectrum (FH/SS) can be used to mitigate the distortion caused by
frequency-selective fading, provided that the hopping rate is at least equal to the symbol rate. FH receivers
avoid the degradation effects due to multipath by rapidly changing the transmitter carrier-frequency band,
thus avoiding the interference by changing the receiver band position before the arrival of the multipath signal.

Orthogonal frequency-division multiplexing (OFDM) can be used for signal transmission in
frequency-selective fading channels to avoid the use of an equalizer by lengthening the symbol duration. The
approach is to partition (demultiplex) a high symbol-rate sequence into $N$ symbol groups, so that each group
contains a sequence of a lower symbol rate (by the factor $1/N$) than the original sequence. The signal band
is made up of $N$ orthogonal carrier waves, and each one is modulated by a different symbol group. The
goal is to reduce the symbol rate (signaling rate), $W \approx 1/T_s$, on each carrier to be less than the channel's
coherence bandwidth $f_0$.
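The OFDM design goal translates into a lower bound on the number of carriers. The sketch below finds the
smallest N for which the per-carrier symbol rate W/N is strictly below the coherence bandwidth; the
1 Msymbol/s rate and the two coherence bandwidths are hypothetical values.

```python
import math

def min_subcarriers(symbol_rate_hz, coherence_bw_hz):
    """Smallest N such that symbol_rate / N is strictly less than the coherence bandwidth."""
    return math.floor(symbol_rate_hz / coherence_bw_hz) + 1

W = 1.0e6  # hypothetical overall symbol rate in symbols per second

print(min_subcarriers(W, 200.0e3))  # 6
print(min_subcarriers(W, 300.0e3))  # 4
```

In practice N is chosen much larger than this bound, so that each carrier sees a channel that is
comfortably flat rather than marginally so.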

Pilot signal is the name given to a signal intended to facilitate the coherent detection of waveforms. 
Pilot signals can be implemented in the frequency domain as in-band tones, or in the time domain as digital 
sequences that can also provide information about the channel state and thus improve performance in fading 
conditions.

7.8 Mitigation to Combat Fast-Fading Distortion 8

• For fast-fading distortion, use a robust modulation (non-coherent or differentially coherent) that does
not require phase tracking, and reduce the detector integration time.

• Increase the symbol rate, $W \approx 1/T_s$, to be greater than the fading rate, $f_d \approx 1/T_0$, by adding signal
redundancy.

• Error-correction coding and interleaving can provide mitigation, because instead of providing more
signal energy, a code reduces the required $E_b/N_0$. For a given $E_b/N_0$ with coding present, the error
floor will be lowered compared to the uncoded case.

When fast-fading distortion and frequency-selective distortion occur simultaneously, the frequency-selective 
distortion can be mitigated by the use of an OFDM signal set. Fast fading, however, will typically degrade 
conventional OFDM because the Doppler spreading corrupts the orthogonality of the OFDM subcarriers. A 
polyphase filtering technique is used to provide time-domain shaping and partial-response coding to reduce 
the spectral sidelobes of the signal set, and thus help preserve its orthogonality. The process introduces 
known ISI and adjacent channel interference (ACI) which are then removed by a post-processing equalizer 
and canceling filter. 

7.9 Mitigation to Combat Loss in SNR 9 

Until this point, we have considered the mitigation to combat frequency-selective and fast-fading distortions. 
The next step is to use diversity methods to move the system operating point from the error-performance 
curve labeled as "bad" to a curve that approaches AWGN performance. The term diversity is used to denote 
the various methods available for providing the receiver with uncorrelated renditions of the signal of interest. 
Some of the ways in which diversity methods can be implemented are: 

• Time diversity: transmit the signal on $L$ different time slots with time separation of at least $T_0$.
When used along with error-correction coding, interleaving is a form of time diversity.

• Frequency diversity: transmit the signal on $L$ different carriers with frequency separation of at least
$f_0$. Bandwidth expansion is a form of frequency diversity. The signal bandwidth $W$ is expanded so as to be
greater than $f_0$, thus providing the receiver with several independently fading signal replicas. This achieves
frequency diversity of the order $L = W/f_0$.

Whenever $W$ is made larger than $f_0$, there is the potential for frequency-selective distortion unless
mitigation in the form of equalization is provided.

Thus, an expanded bandwidth can improve system performance (via diversity) only if the
frequency-selective distortion that the diversity may have introduced is mitigated.

• Spread spectrum: In spread-spectrum systems, the delayed signals do not contribute to the fading, 
but to interchip interference. Spread spectrum is a bandwidth-expansion technique that excels at rejecting 
interfering signals. In the case of Direct-Sequence Spread-Spectrum (DS/SS), multipath components 
are rejected if they are time-delayed by more than the duration of one chip. However, in order to approach 
AWGN performance, it is necessary to compensate for the loss in energy contained in those rejected compo- 
nents. The Rake receiver makes it possible to coherently combine the energy from several of the multipath 
components arriving along different paths (with sufficient differential delay). 

• Frequency-hopping spread-spectrum (FH/SS) is sometimes used as a diversity mechanism. The 
GSM system uses slow FH (217 hops/s) to compensate for cases in which the mobile unit is moving very 
slowly (or not at all) and experiences deep fading due to a spectral null. 

• Spatial diversity is usually accomplished through the use of multiple receive antennas, separated by 
a distance of at least 10 wavelengths when located at a base station (and less when located at a mobile unit). 
Signal-processing techniques must be employed to choose the best antenna output or to coherently combine 
all the outputs. Systems have also been implemented with multiple transmitters, each at a different location. 



8 This content is available online at <http://cnx.org/content/m15536/1.1/>.
9 This content is available online at <http://cnx.org/content/m15538/1.1/>.
• Polarization diversity is yet another way to achieve additional uncorrelated samples of the signal.

• Some techniques for improving the loss in SNR in a fading channel are more efficient and more powerful 
than repetition coding. 

Error-correction coding represents a unique mitigation technique, because instead of providing more
signal energy it reduces the required $E_b/N_0$ needed to achieve a desired performance level. Error-correction
coding coupled with interleaving is probably the most prevalent of the mitigation schemes used to provide
improved system performance in a fading environment.

7.10 Diversity Techniques 10 

This section shows the error-performance improvements that can be obtained with the use of diversity 
techniques. 

The bit-error probability, $\bar{P}_b$, averaged through all the "ups and downs" of the fading experience in a
slow-fading channel is as follows:

$$\bar{P}_b = \int_0^{\infty} P_b(x)\, p(x)\, dx$$

where $P_b(x)$ is the bit-error probability for a given modulation scheme at a specific value of SNR = x,
where $x = \alpha^2 E_b/N_0$, and p(x) is the pdf of x due to the fading conditions. With $E_b$ and $N_0$ constant, $\alpha$ is
used to represent the amplitude variations due to fading.

For Rayleigh fading, $\alpha$ has a Rayleigh distribution so that $\alpha^2$, and consequently x, have a chi-
squared distribution:

$$p(x) = \frac{1}{\Gamma} \exp\left(-\frac{x}{\Gamma}\right), \quad x \ge 0$$

where $\Gamma = \overline{\alpha^2}\, E_b/N_0$ is the SNR averaged through the "ups and downs" of fading. If each diversity
(signal) branch, i = 1, 2, ..., M, has an instantaneous SNR = $\gamma_i$, and we assume that each branch has the
same average SNR given by $\Gamma$, then

$$p(\gamma_i) = \frac{1}{\Gamma} \exp\left(-\frac{\gamma_i}{\Gamma}\right), \quad \gamma_i \ge 0$$

The probability that a single branch has SNR less than some threshold $\gamma$ is:

$$P(\gamma_i \le \gamma) = \int_0^{\gamma} p(\gamma_i)\, d\gamma_i = \int_0^{\gamma} \frac{1}{\Gamma} \exp\left(-\frac{\gamma_i}{\Gamma}\right) d\gamma_i = 1 - \exp\left(-\frac{\gamma}{\Gamma}\right)$$

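The probability that all M independent signal diversity branches are received simultaneously with an
SNR less than some threshold value $\gamma$ is:

$$P(\gamma_1, \ldots, \gamma_M \le \gamma) = \left[1 - \exp\left(-\frac{\gamma}{\Gamma}\right)\right]^M$$

The probability that at least one branch achieves SNR > $\gamma$ is:

$$P(\gamma_i > \gamma) = 1 - \left[1 - \exp\left(-\frac{\gamma}{\Gamma}\right)\right]^M$$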

This is the probability of exceeding a threshold when selection diversity is used. 

Example: Benefits of Diversity 

Assume that four-branch diversity is used, and that each branch receives an independently Rayleigh-
fading signal. If the average SNR is $\Gamma$ = 20 dB, determine the probability that all four branches are received
simultaneously with an SNR less than 10 dB (and also, the probability that this threshold will be exceeded).

Compare the results to the case when no diversity is used.

Solution

With $\gamma$ = 10 dB, and $\gamma/\Gamma$ = 10 dB - 20 dB = -10 dB = 0.1, we solve for the probability that the
SNR will drop below 10 dB, as follows:

$$P(\gamma_1, \gamma_2, \gamma_3, \gamma_4 \le 10\ \text{dB}) = [1 - \exp(-0.1)]^4 = 8.2 \times 10^{-5}$$

or, using selection diversity, we can say that

$$P(\gamma_i > 10\ \text{dB}) = 1 - 8.2 \times 10^{-5} = 0.9999$$

Without diversity,

$$P(\gamma_1 \le 10\ \text{dB}) = [1 - \exp(-0.1)]^1 = 0.095$$

$$P(\gamma_1 > 10\ \text{dB}) = 1 - 0.095 = 0.905$$
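
The numbers in this example can be checked with a few lines of Python. The sketch below is only an illustration of the selection-diversity formula under the same assumptions as the example (independent Rayleigh-fading branches with equal average SNR); the function name `prob_all_below` is hypothetical and not part of the course material.

```python
import math


def prob_all_below(threshold_db, avg_snr_db, branches):
    """Probability that every branch SNR is below the threshold.

    Assumes independent Rayleigh-fading branches that share the same
    average SNR (the assumption used in the worked example).
    """
    ratio = 10 ** ((threshold_db - avg_snr_db) / 10)  # threshold / average SNR, linear
    return (1 - math.exp(-ratio)) ** branches


for m in (1, 4):
    p_below = prob_all_below(10, 20, m)
    print(f"M = {m}: P(below) = {p_below:.3g}, P(exceeded) = {1 - p_below:.4f}")
```

Calling the function with other values of M shows how quickly the probability of a simultaneous fade on all branches shrinks as branches are added.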



10 This content is available online at <http://cnx.org/content/m15540/1.1/>.
7.11 Diversity-Combining Techniques 11

The most common techniques for combining diversity signals are selection, feedback, maximal ratio, 
and equal gain. 

Selection combining used in spatial diversity systems involves the sampling of M antenna signals, and 
sending the largest one to the demodulator. Selection-diversity combining is relatively easy to implement 
but not optimal because it does not make use of all the received signals simultaneously. 

With feedback or scanning diversity, the M signals are scanned in a fixed sequence until one is found that 
exceeds a given threshold. This one becomes the chosen signal until it falls below the established threshold, 
and the scanning process starts again. The error performance of this technique is somewhat inferior to the 
other methods, but feedback is quite simple to implement. 

In maximal-ratio combining, the signals from all of the M branches are weighted according to their 
individual SNRs and then summed. The individual signals must be cophased before being summed. 

Maximal-ratio combining produces an average SNR $\bar{\gamma}_M$ equal to the sum of the individual average SNRs,
as shown below:

$$\bar{\gamma}_M = \sum_{i=1}^{M} \bar{\gamma}_i = \sum_{i=1}^{M} \Gamma = M\Gamma$$

where we assume that each branch has the same average SNR given by $\bar{\gamma}_i = \Gamma$.

Thus, maximal-ratio combining can produce an acceptable average SNR, even when none of the individual
$\gamma_i$ is acceptable. It uses each of the M branches in a cophased and weighted manner such that the largest
possible SNR is available at the receiver.

Equal-gain combining is similar to maximal-ratio combining except that the weights are all set to unity. 
The possibility of achieving an acceptable output SNR from a number of unacceptable inputs is still retained. 
The performance is marginally inferior to maximal ratio combining. 
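
The relative ranking of the combining methods can be illustrated with a small Monte Carlo experiment. The sketch below is not a receiver implementation: it assumes independent Rayleigh-fading branches (exponentially distributed branch SNRs), equal noise power on every branch, and perfect cophasing, and it leaves feedback (scanning) diversity out. The function name, the trial count, and the seed are arbitrary choices made for this illustration.

```python
import math
import random


def average_combined_snr(num_branches, avg_snr, trials=20000, seed=1):
    """Estimate the average output SNR (linear scale) of three combiners."""
    rng = random.Random(seed)
    totals = {"selection": 0.0, "equal_gain": 0.0, "maximal_ratio": 0.0}
    for _ in range(trials):
        # Rayleigh amplitude fading gives an exponentially distributed SNR.
        snrs = [rng.expovariate(1.0 / avg_snr) for _ in range(num_branches)]
        # Selection: keep only the strongest branch.
        totals["selection"] += max(snrs)
        # Equal gain: cophased amplitudes add, noise powers add (assumed equal).
        totals["equal_gain"] += sum(math.sqrt(g) for g in snrs) ** 2 / num_branches
        # Maximal ratio: the output SNR is the sum of the branch SNRs.
        totals["maximal_ratio"] += sum(snrs)
    return {name: total / trials for name, total in totals.items()}


results = average_combined_snr(num_branches=4, avg_snr=100.0)  # 100 = 20 dB
for name, value in results.items():
    print(f"{name:14s} average SNR = {10 * math.log10(value):.1f} dB")
```

With four branches the maximal-ratio estimate should land close to four times the branch average, with equal gain slightly below it and selection lower still.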

7.12 Modulation Types for Fading Channels 12 

An amplitude-based signaling scheme such as amplitude shift keying (ASK) or quadrature amplitude
modulation (QAM) is inherently vulnerable to performance degradation in a fading environment. Thus,
for fading channels, the preferred choice for a signaling scheme is a frequency or phase-based modulation
type.

In considering orthogonal FSK modulation for fading channels, the use of MFSK with M = 8 or larger 
is useful because its error performance is better than binary signaling. In slow Rayleigh fading channels, 
binary DPSK and 8-FSK perform within 0.1 dB of each other. 

In considering PSK modulation for fading channels, higher-order modulation alphabets perform poorly. 
MPSK with M = 8 or larger should be avoided. 

Example: Phase Variations in a Mobile Communication System 

The Doppler spread $f_d = V/\lambda$ shows that the fading rate is a direct function of velocity. Table 1 shows
the Doppler spread versus vehicle speed at carrier frequencies of 900 MHz and 1800 MHz. Calculate the 
phase variation per symbol for the case of signaling with QPSK modulation at the rate of 24.3 kilosymbols/s. 

Assume that the carrier frequency is 1800 MHz and that the velocity of the vehicle is 50 miles/hr (80 
km/hr). Repeat for a vehicle speed of 100 miles/hr. 
Table 1 



11 This content is available online at <http://cnx.org/content/m15541/1.1/>.
12 This content is available online at <http://cnx.org/content/m15539/1.1/>.
Velocity                  Doppler (Hz)             Doppler (Hz)
miles/hr     km/hr        900 MHz (λ = 33 cm)      1800 MHz (λ = 16.6 cm)
3            5            4                        8
20           32           27                       54
50           80           66                       132
80           128          106                      212
120          192          160                      320

Table 7.1



Solution 

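At a velocity of 50 miles/hr:

$$\Delta\theta/\text{symbol} = \frac{f_d}{R_s} \times 360^\circ = \frac{132\ \text{Hz}}{24.3 \times 10^3\ \text{symbols/s}} \times 360^\circ \approx 2^\circ/\text{symbol}$$

where $R_s$ is the symbol rate.

At a velocity of 100 miles/hr, the Doppler spread doubles to 264 Hz, so $\Delta\theta$/symbol ≈ 4°/symbol.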

Thus, it should be clear why MPSK with a value of M > 4 is not generally used to transmit information 
in a multipath environment. 
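
The phase variation per symbol can also be computed directly from the vehicle speed. The following sketch assumes a propagation speed of 3 x 10^8 m/s and the exact miles/hr to m/s conversion, so its Doppler values come out slightly higher than the rounded table entries (about 134 Hz instead of 132 Hz at 50 miles/hr). The function names are illustrative only.

```python
SPEED_OF_LIGHT = 3.0e8   # m/s, assumed propagation speed
MPH_TO_MPS = 0.44704     # exact miles/hr to m/s conversion


def doppler_spread_hz(speed_mph, carrier_hz):
    """Doppler spread f_d = V / lambda."""
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return speed_mph * MPH_TO_MPS / wavelength


def phase_change_deg_per_symbol(speed_mph, carrier_hz, symbol_rate):
    """Phase variation accumulated over one symbol interval, in degrees."""
    return doppler_spread_hz(speed_mph, carrier_hz) / symbol_rate * 360.0


for mph in (50, 100):
    f_d = doppler_spread_hz(mph, 1800e6)
    dtheta = phase_change_deg_per_symbol(mph, 1800e6, 24.3e3)
    print(f"{mph} miles/hr: f_d = {f_d:.0f} Hz, phase change = {dtheta:.1f} deg/symbol")
```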



7.13 The Role of an Interleaver 13



The primary benefit of an interleaver for transmission in a fading environment is to provide time diversity
(when used along with error-correction coding). 
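
As an illustration of how interleaving provides time diversity, the sketch below uses a simple block interleaver (written row by row, read column by column). This is an assumed, generic structure chosen for clarity, not the interleaver of any particular standard; the 4 x 6 size and the burst position are arbitrary.

```python
def interleave(symbols, rows, cols):
    """Write row by row into a rows x cols array, read column by column."""
    assert len(symbols) == rows * cols
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Inverse of interleave(): restore the original symbol order."""
    assert len(symbols) == rows * cols
    out = [None] * (rows * cols)
    for i, value in enumerate(symbols):
        c, r = divmod(i, rows)
        out[r * cols + c] = value
    return out


ROWS, COLS = 4, 6
coded = list(range(ROWS * COLS))        # stand-in for coded symbols
sent = interleave(coded, ROWS, COLS)

burst = set(range(8, 12))               # four consecutive channel symbols lost in a fade
received = [None if i in burst else v for i, v in enumerate(sent)]

restored = deinterleave(received, ROWS, COLS)
erased_positions = [i for i, v in enumerate(restored) if v is None]
print(erased_positions)
```

After deinterleaving, the four erased symbols are no longer adjacent; they are spread COLS positions apart, which is the kind of isolated error pattern an error-correcting code handles well.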

Figure 1 illustrates the benefits of providing an interleaver time span $T_{IL}$ that is large compared to the
channel coherence time $T_0$, for the case of DBPSK modulation with soft-decision decoding of a rate 1/2,
K = 7 convolutional code, over a slow Rayleigh-fading channel.



13 This content is available online at <http://cnx.org/content/m15542/1.1/>.
It should be apparent that an interleaver having the largest ratio of $T_{IL}/T_0$ is the best-performing (large
demodulated BER leading to small decoded BER). This leads to the conclusion that $T_{IL}/T_0$ should be some
large number, say 1,000 or 10,000. However, in a real-time communication system this is not possible
because the inherent time delay associated with an interleaver would be excessive.

The previous section shows that for a cellular telephone system with a carrier frequency of 900 MHz, a
$T_{IL}/T_0$ ratio of 10 is about as large as one can implement without suffering excessive delay.

Note that the interleaver provides no benefit against multipath unless there is motion between the trans- 
mitter and receiver (or motion of objects within the signal-propagating paths). The system error-performance 
over a fading channel typically degrades with increased speed because of the increase in Doppler spread or 
fading rapidity. However, the action of an interleaver in the system provides mitigation, which becomes more
effective at higher speeds.

Figure 2 shows that, although communications degrade with increased speed of the mobile unit (the fading rate
increases), the benefit of an interleaver is enhanced with increased speed. These are the results of field testing
performed on a CDMA system meeting the Interim Specification 95 (IS-95) over a link comprising a
moving vehicle and a base station.
Typical $E_b/N_0$ performance versus vehicle speed for 850 MHz links to achieve a frame-error rate of 1
percent over a Rayleigh channel with two independent paths
7.14 Application of Viterbi Equalizer in GSM System 14

The GSM time-division multiple access (TDMA) frame in Figure 1 has a duration of 4.615 ms and
comprises 8 slots, one assigned to each active mobile user. A normal transmission burst occupying one time
slot contains 57 message bits on each side of a 26-bit midamble, called a training or sounding sequence. 
The slot-time duration is 0.577 ms (or the slot rate is 1733 slots/s). The purpose of the midamble is to assist 
the receiver in estimating the impulse response of the channel adaptively (during the time duration of each 
0.577 ms slot). For the technique to be effective, the fading characteristics of the channel must not change 
appreciably during the time interval of one slot. 
Consider a GSM receiver used aboard a high-speed train, traveling at a constant velocity of 200 km/hr
(55.56 m/s). Assume the carrier frequency to be 900 MHz (the wavelength is $\lambda$ = 0.33 m). The distance
corresponding to a half-wavelength is traversed in

$$T_0 \approx \frac{\lambda/2}{V} \approx 3\ \text{ms}$$

which corresponds approximately to the coherence time. Therefore, the channel coherence time is more than
five times greater than the slot time of 0.577 ms.
The time needed for a significant change in channel fading characteristics is relatively long compared to the
time duration of one slot.

The GSM symbol rate (or bit rate, since the modulation is binary) is 271 kilosymbols/s; the bandwidth,
W, is 200 kHz. Since the typical rms delay spread $\sigma_\tau$ in an urban environment is on the order of 2 µs,
the resulting coherence bandwidth is:

$$f_0 \approx \frac{1}{5\sigma_\tau} \approx 100\ \text{kHz}$$

Since $f_0 < W$, the GSM receiver must utilize some form of mitigation to combat frequency-selective
distortion. To accomplish this goal, the Viterbi equalizer is typically implemented.
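
The two channel figures used in this argument can be reproduced numerically. The sketch below assumes the half-wavelength traversal time as the coherence-time estimate and the common rule of thumb that the coherence bandwidth is about 1/(5 x rms delay spread), which reproduces the 100 kHz quoted above; the function names are illustrative.

```python
def coherence_time_s(speed_mps, wavelength_m):
    """Time needed to traverse half a wavelength (coherence-time estimate)."""
    return (wavelength_m / 2.0) / speed_mps


def coherence_bandwidth_hz(rms_delay_spread_s):
    """Rule-of-thumb coherence bandwidth, assumed here as 1 / (5 * sigma_tau)."""
    return 1.0 / (5.0 * rms_delay_spread_s)


SLOT_TIME_S = 0.577e-3      # GSM slot duration
SIGNAL_BANDWIDTH_HZ = 200e3 # GSM bandwidth W

t0 = coherence_time_s(speed_mps=55.56, wavelength_m=0.33)
f0 = coherence_bandwidth_hz(rms_delay_spread_s=2e-6)

print(f"T0 = {t0 * 1e3:.2f} ms, about {t0 / SLOT_TIME_S:.1f} slot times")
print(f"f0 = {f0 / 1e3:.0f} kHz, frequency-selective: {f0 < SIGNAL_BANDWIDTH_HZ}")
```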



14 This content is available online at <http://cnx.org/content/m15544/1.1/>.
Figure 2 shows the basic functional blocks used in a GSM receiver for estimating the channel impulse
response.
This estimate is used to provide the detector with channel-corrected reference waveforms, as explained
below. (The Viterbi algorithm is used in the final step to compute the MLSE of the message bits.)

Let $s_{tr}(t)$ be the transmitted midamble training sequence, and $r_{tr}(t)$ be the corresponding received
midamble training sequence. We have:

$$r_{tr}(t) = s_{tr}(t) * h_c(t)$$

where $h_c(t)$ is the channel impulse response and * denotes convolution. At the receiver, since $r_{tr}(t)$ is
part of the received normal burst, it is extracted and sent to a filter having
impulse response $h_{mf}(t)$ that is matched to $s_{tr}(t)$. This matched filter yields at its output an estimate of
$h_c(t)$, denoted $h_e(t)$:

$$h_e(t) = r_{tr}(t) * h_{mf}(t) = s_{tr}(t) * h_c(t) * h_{mf}(t) = R_s(t) * h_c(t)$$

where $R_s(t) = s_{tr}(t) * h_{mf}(t)$ is the autocorrelation function of $s_{tr}(t)$. If $s_{tr}(t)$ is designed to have a
highly-peaked (impulse-like) autocorrelation function $R_s(t)$, then $h_e(t) \approx h_c(t)$.
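
The matched-filter estimate can be tried out in discrete time. The sketch below is only an illustration: it uses a 13-chip Barker sequence as a stand-in training sequence (the actual GSM midamble is a different, 26-bit sequence), a made-up three-tap channel, and no noise. All names are hypothetical.

```python
def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


# Stand-in training sequence with an impulse-like autocorrelation
# (peak 13, sidelobe magnitudes of at most 1). Not the GSM midamble.
BARKER13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]


def estimate_channel(received, training, num_taps):
    """Matched-filter the received training sequence; return normalized taps."""
    matched = training[::-1]                  # time-reversed training sequence
    filtered = convolve(received, matched)
    zero_lag = len(training) - 1              # output index of the correlation peak
    energy = sum(s * s for s in training)
    return [filtered[zero_lag + k] / energy for k in range(num_taps)]


h_c = [1.0, 0.5, 0.2]                         # assumed channel taps
r_tr = convolve(BARKER13, h_c)                # received training sequence
h_e = estimate_channel(r_tr, BARKER13, num_taps=len(h_c))
print([round(tap, 3) for tap in h_e])
```

The estimated taps differ from the true ones only by the contribution of the autocorrelation sidelobes, which is why a training sequence with a highly peaked autocorrelation is wanted.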

Next, we use a windowing function, w(t), to truncate $h_e(t)$ to form a computationally affordable function,
$h_w(t)$. The time duration of w(t), denoted $L_0$, must be large enough to compensate for the effect of typical
channel-induced ISI. The term $L_0$ consists of the sum of two contributions, namely $L_{CISI}$, corresponding to
the controlled ISI caused by Gaussian filtering of the baseband waveform (which then modulates the carrier
using MSK), and $L_C$, corresponding to the channel-induced ISI caused by multipath propagation. Thus,

$$L_0 = L_{CISI} + L_C$$

The GSM system is required to mitigate distortion caused by signal dispersion having delay
spreads of approximately 15-20 µs. Since in GSM the bit duration is 3.69 µs, we can express $L_0$ in units of
bit intervals. Thus, the Viterbi equalizer used in GSM has a memory of 4-6 bit intervals. For each $L_0$-bit
interval in the message, the function of the Viterbi equalizer is to find the most likely $L_0$-bit sequence out
of the $2^{L_0}$ possible sequences that might have been transmitted.

Determining the most likely transmitted $L_0$-bit sequence requires that $2^{L_0}$ meaningful reference waveforms
be created by modifying (or disturbing) the $2^{L_0}$ ideal waveforms (generated at the receiver) in the same way that the
channel has disturbed the transmitted slot. Therefore, the $2^{L_0}$ reference waveforms are convolved with the
windowed estimate of the channel impulse response, $h_w(t)$, in order to generate the disturbed or so-called
channel-corrected reference waveforms.

Next, the channel-corrected reference waveforms are compared against the received data waveforms to
yield metric calculations. However, before the comparison takes place, the received data waveforms are
convolved with the known windowed autocorrelation function $w(t) R_s(t)$, transforming them in a manner
comparable to the transformation applied to the reference waveforms. This filtered message signal is compared
to all possible $2^{L_0}$ channel-corrected reference signals, and metrics are computed in a manner similar
to that used in the Viterbi decoding algorithm. This yields the maximum likelihood estimate of the
transmitted data sequence.
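
The idea of comparing the received waveform against channel-corrected references can be sketched with an exhaustive search over one short block. This is a simplified illustration, not the Viterbi algorithm itself (which reaches the same decision without enumerating every sequence), and it omits the filtering of the received data by the windowed autocorrelation function. The channel taps, the disturbance values, and all names are assumptions made for the example.

```python
from itertools import product


def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def channel_corrected_references(h_w, num_bits):
    """Build all 2**num_bits ideal waveforms and disturb each with h_w."""
    references = {}
    for bits in product((0, 1), repeat=num_bits):
        ideal = [1.0 if b else -1.0 for b in bits]   # antipodal signaling assumed
        references[bits] = convolve(ideal, h_w)
    return references


def most_likely_sequence(received, h_w, num_bits):
    """Return the bit pattern whose reference is closest to the received block."""
    references = channel_corrected_references(h_w, num_bits)

    def metric(bits):
        # Squared Euclidean distance between received and reference waveforms.
        return sum((r - c) ** 2 for r, c in zip(received, references[bits]))

    return min(references, key=metric)


h_w = [1.0, 0.5, 0.2]                    # assumed windowed channel estimate
sent_bits = (1, 0, 1, 1)
clean = convolve([1.0 if b else -1.0 for b in sent_bits], h_w)
disturbance = [0.1, -0.2, 0.15, -0.1, 0.05, 0.1]   # small made-up noise samples
received = [x + n for x, n in zip(clean, disturbance)]

decided_bits = most_likely_sequence(received, h_w, num_bits=len(sent_bits))
print(decided_bits)
```

The exhaustive search grows as $2^{L_0}$, which is why the equalizer memory is kept to a few bit intervals and why the trellis-based Viterbi algorithm is used in practice.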

7.15 Application of Rake Receiver in CDMA System 15 

Interim Specification 95 (IS-95) describes a Direct-Sequence Spread-Spectrum (DS/SS) cellular system 
that uses a Rake receiver to provide path diversity for mitigating the effects of frequency-selective fading. 
The Rake receiver searches through the different multipath delays for code correlation and thus recovers 
delayed signals that are then optimally combined with the output of other independent correlators. 

Figure 1 shows the power profiles associated with the five chip transmissions of the code sequence 1 0 1
1 1. Each abscissa shows three components arriving with delays $\tau_1$, $\tau_2$, and $\tau_3$. Assume that the intervals
between the transmission times $t_i$ and the intervals between the delay times $\tau_i$ are each one chip in duration.
The component arriving at the receiver at time $t_{-4}$, with delay $\tau_3$, is time-coincident with two others, namely
the components arriving at times $t_{-3}$ and $t_{-2}$ with delays $\tau_2$ and $\tau_1$, respectively. Since in this example the
delayed components are separated by at least one chip time, they can be resolved.



15 This content is available online at <http://cnx.org/content/m15534/1.2/>.
At the receiver, there must be a sounding device dedicated to estimating the $\tau_i$ delay times. Note that the
fading rate in a mobile radio system is relatively slow (on the order of milliseconds); that is, the channel coherence
time is large compared to the chip time duration ($T_0 > T_{ch}$). Hence, the changes in $\tau_i$ occur slowly enough
that the receiver can readily adapt to them.

Once the $\tau_i$ delays are estimated, a separate correlator is dedicated to recovering each resolvable multipath
component. In this example, there would be three such dedicated correlators, each one processing a delayed
version of the same chip sequence 1 0 1 1 1. Each correlator receives chips with power profiles represented by
the sequence of components shown along a diagonal line. For simplicity, the chips are all shown as positive 
signaling elements. In reality, these chips form a pseudonoise (PN) sequence, which of course contains 
both positive and negative pulses. Each correlator attempts to correlate these arriving chips with the same 
appropriately synchronized PN code. At the end of a symbol interval (typically there may be hundreds 
or thousands of chips per symbol), the outputs of the correlators are coherently combined, and a symbol 
detection is made. 

The interference-suppression capability of DS/SS systems stems from the fact that a code sequence 
arriving at the receiver time-shifted by merely one chip will have very low correlation to the particular PN 
code with which the sequence is correlated. Therefore, any code chips that are delayed by one or more chip 
times will be suppressed by the correlator. The delayed chips only contribute to raising the interference level 
(correlation sidelobes). 

The mitigation provided by the Rake receiver can be termed path diversity, since it allows the energy of 
a chip that arrives via multiple paths to be combined coherently. Without the Rake receiver, this energy 
would be transparent and therefore lost to the DS/SS receiver. 
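
The combining performed by a Rake receiver can be sketched in a few lines. The example below is a toy model under stated assumptions: one symbol spread by a 13-chip Barker code (a short stand-in for a long PN sequence), three paths spaced exactly one chip apart, path gains that are known to the receiver (as if supplied by the sounding device), and no noise. All names and values are hypothetical.

```python
# Short stand-in spreading code with low autocorrelation sidelobes.
BARKER13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]


def multipath_channel(chips, path_gains):
    """Sum delayed copies of the chip stream; path k is delayed by k chips."""
    out = [0.0] * (len(chips) + len(path_gains) - 1)
    for delay, gain in enumerate(path_gains):
        for n, chip in enumerate(chips):
            out[n + delay] += gain * chip
    return out


def finger_output(received, code, delay):
    """Correlate the received chips with the code aligned to one path delay."""
    return sum(received[n + delay] * c for n, c in enumerate(code))


def rake_combine(received, code, path_gains):
    """Run one correlator (finger) per path and combine them coherently."""
    fingers = [finger_output(received, code, d) for d in range(len(path_gains))]
    combined = sum(g * f for g, f in zip(path_gains, fingers))  # gain-weighted sum
    return fingers, combined


symbol = -1                                   # transmitted data symbol
tx_chips = [symbol * c for c in BARKER13]
gains = [1.0, 0.6, 0.4]                       # assumed path amplitudes
rx_chips = multipath_channel(tx_chips, gains)

fingers, combined = rake_combine(rx_chips, BARKER13, gains)
print("finger outputs:", [round(f, 2) for f in fingers])
print("combined:", round(combined, 2), "-> decided symbol:", 1 if combined > 0 else -1)
```

Each finger recovers the energy of its own path while the other paths appear only as small correlation sidelobes, and the weighted sum has a larger magnitude than any single finger.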
Glossary



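A antipodal

Signals $s_1(t)$ and $s_2(t)$ are antipodal if $\forall t,\ t \in [0, T]: s_2(t) = -s_1(t)$

Autocorrelation

The autocorrelation function of the random process $X_t$ is defined as

$$R_X(t_2, t_1) = E[X_{t_2} X_{t_1}] = \begin{cases} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x_2 x_1 f_{X_{t_2}, X_{t_1}}(x_2, x_1)\, dx_1\, dx_2 & \text{if continuous} \\ \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} x_l x_k\, p_{X_{t_2}, X_{t_1}}(x_l, x_k) & \text{if discrete} \end{cases} \quad (2.50)$$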

Autocovariance

Autocovariance of a random process is defined as

$$C_X(t_2, t_1) = E[(X_{t_2} - \mu_X(t_2))(X_{t_1} - \mu_X(t_1))] = R_X(t_2, t_1) - \mu_X(t_2)\, \mu_X(t_1) \quad (2.60)$$



B biorthogonal

Signals $s_1(t), s_2(t), \ldots, s_M(t)$ are biorthogonal if $s_1(t), \ldots, s_{M/2}(t)$ are orthogonal and
$s_m(t) = -s_{M/2+m}(t)$ for some $m \in \{1, 2, \ldots, M/2\}$.

C Conditional Entropy

The conditional entropy of the random variable X given the random variable Y is defined by

$$H(X|Y) = -\sum_{i} \sum_{j} p_{X,Y}(x_i, y_j) \log p_{X|Y}(x_i | y_j) \quad (3.8)$$

Continuous Random Variable

A random variable X is continuous if the cumulative distribution function can be written in an
integral form, or

$$F_X(b) = \int_{-\infty}^{b} f_X(x)\, dx \quad (2.14)$$

and $f_X(x)$ is the probability density function (pdf) (e.g., $F_X(x)$ is differentiable and
$f_X(x) = \frac{d}{dx} F_X(x)$)

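Crosscorrelation

The crosscorrelation function of a pair of random processes is defined as

$$R_{XY}(t_2, t_1) = E[X_{t_2} Y_{t_1}] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x y\, f_{X_{t_2}, Y_{t_1}}(x, y)\, dx\, dy$$

The corresponding crosscovariance function is

$$C_{XY}(t_2, t_1) = R_{XY}(t_2, t_1) - \mu_X(t_2)\, \mu_Y(t_1) \quad (2.62)$$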
Cumulative distribution

The cumulative distribution function of a random variable X is a function $F_X(\cdot): \mathbb{R} \mapsto \mathbb{R}$ such
that

$$F_X(b) = \Pr[X \le b] = \Pr[\{\omega \in \Omega \mid X(\omega) \le b\}] \quad (2.13)$$



D Discrete Random Variable

A random variable X is discrete if it only takes at most countably many points (i.e., $F_X(\cdot)$ is
piecewise constant). The probability mass function (pmf) is defined as

$$p_X(x_k) = \Pr[X = x_k] = F_X(x_k) - \lim_{x \to x_k,\ x < x_k} F_X(x) \quad (2.15)$$

E Entropy Rate

The entropy rate of a stationary discrete-time random process is defined by

$$H = \lim_{n \to \infty} H(X_n | X_1 X_2 \ldots X_{n-1}) \quad (3.12)$$

The limit exists and is equal to

$$H = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \ldots, X_n) \quad (3.13)$$

The entropy rate is a measure of the uncertainty of information content per output symbol of 
the source. 

Entropy 

1. The entropy (average self information) of a discrete random variable X is a function of its
probability mass function and is defined as

$$H(X) = -\sum_{i=1}^{N} p_X(x_i) \log p_X(x_i) \quad (3.3)$$

where N is the number of possible values of X and $p_X(x_i) = \Pr[X = x_i]$. If log is base 2
then the unit of entropy is bits. Entropy is a measure of uncertainty in a random variable and a
measure of information it can reveal.

2. A more basic explanation of entropy is provided in another module 16 . 

F First-order stationary process

If $F_{X_t}(b)$ is not a function of time then $X_t$ is called a first-order stationary process.

G Gaussian process

A process with mean $\mu_X(t)$ and covariance function $C_X(t_2, t_1)$ is said to be a Gaussian process
if any $\mathbf{X} = (X_{t_1}, X_{t_2}, \ldots, X_{t_N})^T$ formed by any sampling of the process is a Gaussian random
vector, that is,

$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{(2\pi)^{N/2} (\det \Sigma_X)^{1/2}} \exp\left(-\frac{1}{2} (\mathbf{x} - \mu_X)^T \Sigma_X^{-1} (\mathbf{x} - \mu_X)\right) \quad (2.63)$$



"Entropy" <http://cnx.org/content/m0070/latest/> 
for all $\mathbf{x} \in \mathbb{R}^N$, where

$$\mu_X = \begin{pmatrix} \mu_X(t_1) \\ \vdots \\ \mu_X(t_N) \end{pmatrix}$$

and

$$\Sigma_X = \begin{pmatrix} C_X(t_1, t_1) & \cdots & C_X(t_1, t_N) \\ \vdots & \ddots & \vdots \\ C_X(t_N, t_1) & \cdots & C_X(t_N, t_N) \end{pmatrix}$$

The complete statistical properties of $X_t$ can be obtained from the second-order statistics.

J Joint Entropy

The joint entropy of two discrete random variables (X, Y) is defined by

$$H(X, Y) = -\sum_{i} \sum_{j} p_{X,Y}(x_i, y_j) \log p_{X,Y}(x_i, y_j) \quad (3.6)$$
Jointly Wide Sense Stationary

The random processes $X_t$ and $Y_t$ are said to be jointly wide sense stationary if $R_{XY}(t_2, t_1)$ is a
function of $t_2 - t_1$ only and $\mu_X(t)$ and $\mu_Y(t)$ are constant.

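M Mean

The mean function of a random process $X_t$ is defined as the expected value of $X_t$ for all t's.

$$\mu_X(t) = E[X_t] = \begin{cases} \int_{-\infty}^{\infty} x f_{X_t}(x)\, dx & \text{if continuous} \\ \sum_{k=-\infty}^{\infty} x_k\, p_{X_t}(x_k) & \text{if discrete} \end{cases} \quad (2.49)$$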



Mutual Information

The mutual information between two discrete random variables is denoted by $I(X; Y)$ and
defined as

$$I(X; Y) = H(X) - H(X|Y) \quad (6.7)$$

Mutual information is a useful concept to measure the amount of information shared between 
input and output of noisy channels. 

O orthogonal

Signals $s_1(t), s_2(t), \ldots, s_M(t)$ are orthogonal if $\langle s_m, s_n \rangle = 0$ for $m \ne n$.

P Power Spectral Density

The power spectral density function of a wide sense stationary (WSS) process $X_t$ is defined to be
the Fourier transform of the autocorrelation function of $X_t$.

$$S_X(f) = \int_{-\infty}^{\infty} R_X(\tau)\, e^{-j 2 \pi f \tau}\, d\tau \quad (2.84)$$

if $X_t$ is WSS with autocorrelation function $R_X(\tau)$.
S Simplex signals

Let $\{s_1(t), s_2(t), \ldots, s_M(t)\}$ be a set of orthogonal signals with equal energy. The signals
$\tilde{s}_1(t), \ldots, \tilde{s}_M(t)$ are simplex signals if

$$\tilde{s}_m(t) = s_m(t) - \frac{1}{M} \sum_{k=1}^{M} s_k(t) \quad (4.3)$$



Stochastic Process

Given a sample space, a stochastic process is an indexed collection of random variables defined
for each $\omega \in \Omega$.

$$\forall t,\ t \in \mathbb{R}: X_t(\omega) \quad (2.30)$$

U Uncorrelated random variables

Two random variables X and Y are uncorrelated if $\rho_{XY} = 0$.

W Wide Sense Stationary

A process is said to be wide sense stationary if $\mu_X$ is constant and $R_X(t_2, t_1)$ is only a function
of $t_2 - t_1$.